Effects of Image Transformations

One important set of behavioral studies involves psychophysical experiments, where psychological responses to variations in a physical property of object stimuli are recorded. The basic logic of these experiments is that, if a certain stimulus dimension is encoded in the representations used for object perception, then transformations to stimuli along this dimension should cause impairments of performance (i.e., increased response times and/or increased error rates) in object perception tasks.

To take a well-known and oft-replicated example, many researchers have shown subjects pictures of common objects in different orientations, ranging from the object's canonical upright position to an upside-down orientation. Example stimuli and the typical pattern of results from these experiments are shown in Fig. 6. In the initial block of trials, subjects usually require more time to recognize an object the farther it is rotated from its upright orientation. However, there is often a small dip in the response time function at 180° (the upside-down position), as shown in the graph. Another important feature of these studies' results is that, if the experimental trials are repeated, subjects take considerably less time to recognize the misoriented objects; that is, the slope of the response time function decreases with practice.

Figure 6

Schematic results from psychophysical experiments in which common objects, such as the car shown at bottom, are presented in various orientations and must be named as quickly as possible by subjects. The horizontal axis shows rotation from upright (0° to 300° in 60° steps). Three results are evident in the graph: (1) naming response time generally increases as the object is rotated away (either clockwise or counterclockwise) from its canonical upright orientation; (2) there is a small dip in the response time function at 180°; and (3) the slope of this function is reduced if the stimuli are seen again in a second block of trials.

These findings indicate that something about objects' canonical orientations is encoded in the representations used for object perception. This conclusion helps to constrain both view-based and structural description theories. The fact that response time varies with degree of misorientation is explained in many view-based theories as a consequence of a transformation process necessary to bring a misoriented view of an object into register with the encoded, canonical view. The larger the degree of misorientation, the longer this transformation process (which is likened to mental rotation in some theories) takes, and the longer it will take subjects to recognize the object. To explain the decrease in misorientation effects over test blocks, these theories propose that multiple views of an object can be stored in memory. Once an object is seen lying on its side in the first block of test trials, a representation is encoded of the object in that orientation. When this stimulus is encountered in subsequent blocks of trials, the transformation process is no longer necessary because the test object can be directly compared to the newly encoded representation of the object on its side.
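The view-based account just described can be illustrated with a toy calculation. The sketch below is not taken from any particular published model: it assumes that recognition time grows linearly with the angular distance to the nearest stored view, and the names `predicted_rt`, `base_ms`, and `ms_per_degree`, along with their values, are hypothetical choices made only for illustration.

```python
def angular_distance(a, b):
    """Smallest picture-plane rotation (in degrees) between two orientations."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def predicted_rt(test_angle, stored_views, base_ms=700.0, ms_per_degree=2.0):
    """Toy response time: a fixed base plus a cost proportional to the
    rotation needed to align the test view with the nearest stored view.
    The base and rate values are illustrative assumptions, not data."""
    nearest = min(angular_distance(test_angle, v) for v in stored_views)
    return base_ms + ms_per_degree * nearest


angles = [0, 60, 120, 180, 240, 300]

# Block 1: only the canonical upright view (0 degrees) is stored.
block1 = [predicted_rt(a, [0]) for a in angles]

# Block 2: assume every orientation seen in block 1 has now been stored.
block2 = [predicted_rt(a, angles) for a in angles]

for a, rt1, rt2 in zip(angles, block1, block2):
    print(f"{a:3d} deg  block 1: {rt1:6.0f} ms  block 2: {rt2:6.0f} ms")
```

Two simplifications are worth noting. The perfectly flat second block is an idealization, since the experiments show a reduced slope rather than no slope at all. This simple distance rule also does not produce the small dip at 180°, so it captures only the first and third results listed in the figure caption.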

In structural description theories, orientation effects are explained as a function of shifting categorical relations between parts. The structural description of a car will include the proposition that the windshield is ABOVE the front wheel. When the car is rotated 90° in the picture plane, though, the wheel will be BESIDE the windshield, causing, as in view-based theories, an imperfect match between perceived and encoded representations, which leads to an increase in response times. The same type of mechanism also provides an interesting account of the dip in the response time function at 180°. Parts that had BESIDE relationships in upright objects (e.g., the front and rear wheels of the car, assuming the car is viewed from the side) shift to ABOVE-BELOW relations when the object is rotated 90°, but return to BESIDE relations when the object is fully upside down. Thus, in some cases the structural description of a 180° rotated object matches the upright object better than does the 90° rotated version.

Note that this account implies that structural descriptions do not include information as to whether one part is to the left or to the right of another—both LEFT and RIGHT relations are coded as BESIDE. If this is the case, then two images that are mirror reflections of each other should share the same structural description, and image reflections should have little, if any, effect on recognition accuracy or response time. As noted in Section I, this pattern of reflection invariance has, in fact, been empirically demonstrated.
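The structural description account of rotation and reflection can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the part coordinates for the car, the rule that a relation is ABOVE or BELOW when the vertical offset dominates and BESIDE otherwise, and the helper names `relation`, `describe`, and `match_score`. It is a minimal sketch of the idea, not an implementation of any specific theory.

```python
import math

# Hypothetical part centers for a side view of a car (x to the right, y up).
CAR = {
    "windshield": (1.0, 1.0),
    "front_wheel": (1.0, 0.0),
    "rear_wheel": (-1.0, 0.0),
}
# Part pairs whose categorical relation is stored in the description.
PAIRS = [("windshield", "front_wheel"), ("front_wheel", "rear_wheel")]


def rotate(parts, degrees):
    """Rotate all part centers counterclockwise in the picture plane."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    # Rounding removes tiny floating-point residue from cos/sin.
    return {n: (round(x * c - y * s, 9), round(x * s + y * c, 9))
            for n, (x, y) in parts.items()}


def reflect(parts):
    """Mirror-reflect all part centers about the vertical axis."""
    return {n: (-x, y) for n, (x, y) in parts.items()}


def relation(a, b):
    """Categorical relation of part a to part b; left and right are
    deliberately collapsed into a single BESIDE relation."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    if abs(dy) > abs(dx):
        return "ABOVE" if dy > 0 else "BELOW"
    return "BESIDE"


def describe(parts):
    return {(p, q): relation(parts[p], parts[q]) for p, q in PAIRS}


def match_score(parts, stored):
    """Number of stored relations that the perceived object reproduces."""
    perceived = describe(parts)
    return sum(perceived[k] == stored[k] for k in stored)


STORED = describe(CAR)  # description of the upright car
scores = {deg: match_score(rotate(CAR, deg), STORED) for deg in (0, 90, 180)}
mirror_score = match_score(reflect(CAR), STORED)
print(scores, mirror_score)
```

With these assumed coordinates, the upright car matches both stored relations, the 90° version matches neither, and the 180° version matches one (the wheels are BESIDE each other again, while the windshield is now BELOW the front wheel). The mirror-reflected car matches both relations, which is the reflection invariance the text describes.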

Other experiments indicate that size transformations and object translations also generally have minimal effects on object perception responses. Effects of lighting direction are only starting to be studied systematically (before the ready availability of 3D modeling software, it was difficult to generate stimuli for such experiments), with early results suggesting that lighting changes also cause only small, though systematic, disruptions to object perception processes. Object perception across rotations in the depth plane (i.e., changes in viewpoint) has also come under increasing scrutiny since the development of inexpensive computer modeling programs. Consistent with the research on picture plane rotations, most studies have found systematic and substantial effects of viewpoint shifts on object perception response time and error rates.

The relative invariance of object perception over mirror reflection, size, position, and lighting changes appears to favor structural description over view-based theories because, as noted in Section I, these transformations all lead to large changes in images (which are the presumed representations in the simplest view-based theories). However, structural description theories also predict at least limited viewpoint invariance, because structural descriptions should only be altered by a viewpoint shift when parts become occluded or uncovered or when categorical relations are altered. Careful investigation has shown that, even when part accretion and deletion are minimized, viewpoint costs to object perception performance are still incurred. Small picture plane rotations, which should also have no effect on categorical relations, can also have measurable effects on recognition time. Thus, view-based models appear to provide a better account than structural description theories for findings of orientation and viewpoint dependence.

The description of psychophysical research given here is, by necessity, oversimplified. There are numerous exceptions to these generalizations (e.g., object perception has been shown to be orientation-invariant in some circumstances and size-dependent in others), and many investigators have seized upon this ambiguity to argue that view-based representations are used in some object perception contexts and structural descriptions are used in others. Perhaps the most pervasive proposition is that recognition in contexts where all candidate objects are easily discriminable from each other (e.g., distinguishing between a car, a bicycle, and a jogger) relies on structural descriptions, whereas object perception in more demanding circumstances (e.g., identifying a sparrow in an area where warblers and finches also live) requires view-based representations. As intuitive as this hypothesis appears to be, experiments designed to directly test this particular dual-system hypothesis have provided mixed evidence; some studies support it but others do not. Thus, the more exciting possibility remains that either a single view-based or structural description theory, or a model employing some type of hybrid representation, will eventually be able to explain all of the observed patterns of dependencies and invariances across image transformations.
