## Scalar Timing and Sampling a Changing Environment

Shettleworth et al. (1988) tested Stephens' (1987) sampling model, described previously, by setting up an operant analog of the sampling problem in the laboratory. Their experiment was carried out with pigeons in a shuttlebox in which the two feeding sites were represented by feeders and keys at either end of the box. On the stable side of the box, a random ratio schedule delivered rewards with a probability that remained unchanged throughout an experimental treatment. On the fluctuating side of the box, the schedule alternated between no reward and a random ratio schedule equal to or lower than that on the stable side (i.e., delivering rewards at least as frequently). Two differently colored lights provided information about the state of the fluctuating side as soon as the bird pecked there. Following each reinforcement, the pigeon was required to return to the center of the box to initiate a new trial. Changes in the state of the fluctuating side occurred with a probability of .002 per trial. Shettleworth et al. (1988) investigated how the pigeons allocated their choices between the stable and fluctuating sides of the box in a range of treatments in which both the probability of reinforcement in the stable patch and the probability of reinforcement in the good state of the fluctuating patch were manipulated.
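The procedure can be made concrete with a small simulation. The following is only an illustrative sketch, not the authors' apparatus or code: the function name `make_environment`, the parameter names, the choice to start the fluctuating side in its bad state, and the simplification that each trial offers a single reward opportunity are all assumptions introduced here.

```python
import random


def make_environment(p_stable, p_good, p_change=0.002, seed=0):
    """Return a `trial(side)` function simulating one choice per call.

    Hypothetical simplification: each trial offers a single reward
    opportunity, and the fluctuating side starts in its bad state.
    """
    rng = random.Random(seed)
    good = False  # assumed starting state of the fluctuating side

    def trial(side):
        nonlocal good
        # The fluctuating side may switch state on any trial.
        if rng.random() < p_change:
            good = not good
        if side == "stable":
            p_reward = p_stable
        else:
            # No reward is available while the fluctuating side is bad.
            p_reward = p_good if good else 0.0
        rewarded = rng.random() < p_reward
        return rewarded, good

    return trial


# Example: a bird that never leaves the stable side.
trial = make_environment(p_stable=0.1, p_good=0.2)
rewards = sum(trial("stable")[0] for _ in range(1000))
print(f"Rewards in 1000 stable-side trials: {rewards}")
```

With a change probability of .002 per trial, a state of the fluctuating side lasts 500 trials on average, which is why a bird that never samples can miss long runs of the good state.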

As predicted by Stephens' optimality model, the results showed that when the fluctuating patch was bad, pigeons chose the stable patch most of the time, only infrequently sampling the fluctuating patch. However, if on a sampling visit the fluctuating patch was in its good state, the pigeon would stay there until the patch reverted to its bad state. Furthermore, as predicted by the optimality model, the pigeons increased their rate of sampling the fluctuating patch as the probability of reinforcement in the stable patch was decreased. However, the behavior of the pigeons deviated in three important ways from the predictions of the unconstrained optimality model. First, sampling did not occur at the regular intervals predicted by Stephens' model, but instead occurred at random intervals. Second, sampling frequency was not sensitive to the probability of reinforcement in the good state of the fluctuating patch, and sampling still occurred when the probability of reinforcement in the good state of the fluctuating patch was equal to that in the stable patch. Finally, when the fluctuating patch was in its good state, the birds occasionally visited the stable patch, thus reducing their rate of food intake.

Scalar timing theory was developed to model experiments in which rewards occur on the basis of time, whereas Shettleworth et al. (1988) used ratio schedules in which rewards were delivered on the basis of responses. To apply scalar timing theory to the data, it was therefore necessary to assume that the pigeons responded at a constant rate. Under this assumption, the problem faced by the pigeon reduces to a simple choice between two different distributions of delays to reward. The pigeon can be viewed as asking, "What is the delay to food if I stay on the stable side, and what is the delay to food if I sample?" According to scalar timing theory, the memory representation of the stable side will be an exponential distribution. The memory representation of the fluctuating side is more complex, and Shettleworth et al. (1988) assume that the time to food can be simplified to the sum of the time in the bad state until the good state begins and the time in the good state until food, both of which are assumed to be represented by exponential distributions. As in the risky choice version of scalar timing described in the previous section, the pigeon is assumed to take a random sample from each of its two memory representations and to use these samples to decide which side offers the shorter delay to food. Once the good state on the fluctuating side has been entered, a new reference memory for the fluctuating side applies, in which the delay to reward is represented by an exponential distribution of delays, as on the stable side. The choice is then between two exponentially distributed delays to reward.
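The decision rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model as fitted by Shettleworth et al. (1988): the function names `choose_side` and `choice_proportion`, the parameterization by mean delays in arbitrary time units, and the example values are all hypothetical, and each memory sample is drawn directly from an exponential distribution without any further scalar noise.

```python
import random


def choose_side(mean_stable, mean_to_good, mean_in_good, in_good_state, rng):
    """Pick a side by comparing one remembered delay sample per side."""
    # Stable side: a single exponentially distributed delay to food.
    stable_sample = rng.expovariate(1.0 / mean_stable)
    if in_good_state:
        # Good state: the fluctuating side is also a single exponential.
        fluct_sample = rng.expovariate(1.0 / mean_in_good)
    else:
        # Bad state: time until the good state begins, plus time in the
        # good state until food (a sum of two exponential delays).
        fluct_sample = (rng.expovariate(1.0 / mean_to_good)
                        + rng.expovariate(1.0 / mean_in_good))
    return "fluctuating" if fluct_sample < stable_sample else "stable"


def choice_proportion(n_trials, in_good_state, seed=0, **means):
    """Proportion of trials on which the fluctuating side is chosen."""
    rng = random.Random(seed)
    picks = sum(
        choose_side(in_good_state=in_good_state, rng=rng, **means)
        == "fluctuating"
        for _ in range(n_trials)
    )
    return picks / n_trials


# Hypothetical mean delays, in arbitrary time units.
means = dict(mean_stable=10.0, mean_to_good=500.0, mean_in_good=5.0)
print("P(sample fluctuating | bad state): ",
      choice_proportion(20000, in_good_state=False, **means))
print("P(stay on fluctuating | good state):",
      choice_proportion(20000, in_good_state=True, **means))
```

Because every choice depends on fresh random samples, the simulated bird samples the fluctuating side only occasionally in the bad state and occasionally leaves it in the good state, without any explicit sampling schedule being built in.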

The scalar timing model can explain some of the ways in which the data deviated from the unconstrained optimality model. First, sampling is not predicted to occur at regular intervals, because decisions are based on random samples from memory distributions. For the same reason, the model predicts both sampling of the fluctuating patch when its probability of reward in the good state equals that in the stable patch, and reverse sampling of the stable patch while the fluctuating patch is in its good state: occasionally, the sample drawn from the memory for the suboptimal side will suggest a shorter delay to food than the sample drawn from the memory for the optimal side.
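These qualitative predictions can be checked with a short calculation. The following is a sketch under assumptions introduced here rather than taken from Shettleworth et al. (1988): the memory samples are treated as independent draws from exponential distributions, and the symbols $\lambda_s$, $\lambda_b$, and $\lambda_g$ are hypothetical names for the rates of the stable-side delay $T_s$, the time $T_b$ in the bad state until the good state begins, and the good-state delay $T_g$.

$$
\begin{aligned}
P(\text{sample} \mid \text{bad state})
  &= P(T_b + T_g < T_s)
   = \mathbb{E}\!\left[e^{-\lambda_s (T_b + T_g)}\right]
   = \frac{\lambda_b}{\lambda_b + \lambda_s}\cdot\frac{\lambda_g}{\lambda_g + \lambda_s}, \\[4pt]
P(\text{reverse sample} \mid \text{good state})
  &= P(T_s < T_g)
   = \frac{\lambda_s}{\lambda_s + \lambda_g}, \\[4pt]
P(N = n)
  &= (1 - p)^{\,n-1}\, p, \qquad n = 1, 2, \dots
\end{aligned}
$$

Both probabilities are strictly positive for any positive rates, including the case $\lambda_g = \lambda_s$, so sampling and reverse sampling are expected even when they cannot increase intake. In the last line, $p$ stands for the per-trial sampling probability in the bad state and $N$ for the number of trials until the next sample; if successive decisions are independent, $N$ is geometrically distributed, which corresponds to irregular rather than evenly spaced sampling.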
