## Scalar Timing and Patch Departure: The Marginal Value Theorem

Kacelnik and Todd (1992; Todd and Kacelnik, 1993) set up an operant analog of the MVT scenario in order to pursue the details of how foraging pigeons respond to travel time. A red flashing light signaled to the pigeon that food was available in the patch. As soon as the pigeon pecked the light, it changed to a steady red light, and a white light was also turned on. At this point the pigeon could choose to peck either the red light to obtain food in the patch or the white light to leave the patch and initiate the travel component of the schedule (which was actually a waiting time in this task). The food in the patch was delivered according to a progressive-interval schedule, such that the delay between food items increased with each successive prey delivered (as required for the MVT to apply). Using this schedule, Kacelnik and Todd (1992) could change various features of the travel component of the schedule and measure the effect that their manipulations had on the dependent variable, which was the number of prey per patch visit (PPV) taken by the bird before it pecked the white key and initiated the next travel component of the schedule.
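The logic of the MVT prediction under a progressive-interval schedule can be illustrated with a minimal sketch. The schedule used here is an assumption made for illustration only (the k-th prey in a visit arrives `k * step` seconds after the previous one); the actual intervals used by Kacelnik and Todd are not given in the text. The function names `patch_rate` and `optimal_ppv` and the parameters `step` and `max_prey` are likewise hypothetical.

```python
def patch_rate(n_prey, travel_time, step=1.0):
    """Long-term intake rate (prey per second) when taking n_prey per visit.

    Assumed schedule (not from the source): the k-th prey arrives
    k * step seconds after the previous one, so delays grow linearly.
    """
    time_in_patch = step * n_prey * (n_prey + 1) / 2
    return n_prey / (travel_time + time_in_patch)


def optimal_ppv(travel_time, step=1.0, max_prey=200):
    """Prey per visit (PPV) that maximizes the long-term intake rate.

    A brute-force search over 1..max_prey; max_prey is an arbitrary cap.
    """
    return max(range(1, max_prey + 1),
               key=lambda n: patch_rate(n, travel_time, step))
```

Under this assumed schedule, calling `optimal_ppv` with a longer travel time returns a larger PPV, which is the qualitative MVT prediction that the experiments were designed to probe.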

In their first study, Kacelnik and Todd (1992) chose to test the MVT prediction that PPV should be sensitive only to mean travel time and should not be affected by variance in travel time. They tested pigeons in three treatments, all of which had mean travel times of 95 sec, but which differed in the variance in travel time. Treatment 10t consisted of a random order of ten different travel times with a coefficient of variation of 60.5%, treatment 2t consisted of a random order of two travel times with a coefficient of variation of 95%, and treatment 1t consisted of a single travel time. Contrary to the predictions of the MVT, the distribution of travel times had an effect on the mean PPV, with the birds having the highest PPV in the 1t treatment, an intermediate PPV in the 10t treatment, and the lowest PPV in the 2t treatment. Thus, PPV decreased as the coefficient of variation in travel time increased.

In order to explain the observed inverse relationship between PPV and variance in travel time, Kacelnik and Todd (1992) considered a modification of the MVT in which the representation of travel times in memory was not a perfect mean, as assumed in the unconstrained MVT, but was instead a distribution, as assumed in the scalar timing model. As shown in Figure 5.2, the travel time in the 1t treatment will be represented by a symmetrical memory representation, but the 10t and 2t distributions become progressively asymmetrical and skewed to the right. Kacelnik and Todd (1992) combined this memory representation of travel times with the MVT by assuming that in order to form an estimate of the background rate of energy intake available in the environment, the bird draws a single random sample from its reference memory holding the representation of experienced travel times. This value is then used to calculate the optimal PPV in the current patch. Due to the skew in the representations of the variable travel times, random samples drawn from the 10t representation will have an average shorter than 95 sec, and samples drawn from the 2t representation will have an average that is shorter still. Because shorter travel times will lead to a smaller optimal PPV, the modified MVT model is capable of explaining the observed effects of variance in travel time. In addition to explaining the effects of variance in travel time on PPV, the modified MVT model also explains another feature of the data not accommodated by the unconstrained MVT. The unconstrained MVT predicts that in the 1t treatment there should be no variance in the PPV taken by a bird; however, in reality the pigeons did have variance in their PPV in the 1t treatment. The scalar timing modification of the MVT explains this variance, because even a single travel time is represented in memory as a distribution from which random samples are taken for the purposes of decision making.
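The sampling idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not Kacelnik and Todd's implementation: each experienced travel time is assumed to be remembered as a normal distribution whose standard deviation is proportional to its mean (the scalar property, with a hypothetical coefficient `gamma`), the patch schedule is the same assumed linear one (k-th prey after `k * step` seconds), and the two-valued mixture `[5.0, 185.0]` is an illustrative pair with a mean of 95 sec rather than the values used in the experiment. The sketch does not attempt to reproduce the representations in Figure 5.2.

```python
import random
import statistics


def optimal_ppv(travel_time, step=1.0, max_prey=100):
    """PPV maximizing intake rate under an assumed linear schedule."""
    def rate(n):
        return n / (travel_time + step * n * (n + 1) / 2)
    return max(range(1, max_prey + 1), key=rate)


def sample_remembered_travel(travel_times, gamma, rng):
    """Draw one remembered travel time from reference memory.

    Assumption: each experienced time t is stored as Normal(t, gamma * t).
    """
    true_time = rng.choice(travel_times)
    remembered = rng.gauss(true_time, gamma * true_time)
    return max(remembered, 0.1)  # keep the sample positive


def simulate_ppv(travel_times, gamma=0.2, visits=2000, seed=1):
    """PPV chosen on each visit when one memory sample drives the MVT."""
    rng = random.Random(seed)
    return [optimal_ppv(sample_remembered_travel(travel_times, gamma, rng))
            for _ in range(visits)]


fixed = simulate_ppv([95.0])         # analog of the 1t treatment
two = simulate_ppv([5.0, 185.0])     # illustrative two-valued mixture
print(statistics.mean(fixed), statistics.pstdev(fixed))
print(statistics.mean(two), statistics.pstdev(two))
```

With these assumed parameters, the fixed treatment yields a nonzero spread in PPV even though only one travel time was experienced, and the two-valued mixture yields a lower mean PPV than the fixed treatment with the same mean travel time.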

Despite the successes of the modified MVT, molecular analysis of the data from the 2t and 10t treatments revealed an important result that was not predicted by either unconstrained or modified MVT. When Kacelnik and Todd (1992) examined the PPV taken by a bird in relation to the previous travel time the bird had experienced, they found a significant positive effect, with the pigeons taking more PPV if the previous travel time had been long. Thus the pigeons appear to be not only responding to the mean and variance of the mixture of travel times, but also weighting the most recently experienced travel time more highly in their decision making. Todd and Kacelnik (1993) confirmed this finding in a subsequent study using the same paradigm that was designed to explicitly address the relative roles of the mixture of travel times and the most recently experienced travel time on PPV. They tested pigeons in two treatments that differed in mean travel time but had similar coefficients of variation (60.7 and 67.9% for the short and long mean treatments, respectively). Importantly, two of the travel times (1 and 13 sec) were contained in both mixtures. Todd and Kacelnik (1993) studied the effects of travel time by comparing the PPV after various travel times within each treatment and by comparing the PPV after equal travel times (1 and 13 sec) between treatments. The within-treatment analysis showed that in the long mean travel time treatment, PPV was correlated with the previous travel time, as they had found in the previous study. When the same travel times were compared between treatments, the pigeons were found to take higher PPV in the long mean travel time treatment than in the short mean travel time treatment. Because this effect cannot be accounted for by the within-treatment effect of the previous travel time, it implies that there is a different and independent effect on PPV of the mixture of travel times.

Because scalar timing theory is a steady-state model that makes no statements about how the reference memory representations are built up, it currently does not address short-term changes in behavior due to very recent experience. However, from a functional point of view it makes sense that the adaptive length of a memory window could vary depending on the stability of the environment. In very stable environments, it would make sense to base decisions on all of the available experience; however, in more changeable environments, it might be adaptive to weight recently acquired information more heavily than information acquired longer ago. In order to remedy this problem, Todd and Kacelnik (1993) developed a dynamic version of their previous model that involved adding an explicit learning algorithm to scalar timing theory and combining this with the MVT. Their aim was to develop a model that could handle the parallel effect of both recent and longer-term memory on foraging decisions that was suggested by their empirical data. Scalar timing theory provided a natural way to approach this problem because its parallel structure, whereby samples may be read at the same time from working and reference memory, allows a way to separate the effects of current percepts of time from remembered experience.

Todd and Kacelnik's new model contains two innovations that allow recent experience to have greater impact on decision making. First, the reference memory representation is built up and continually modified according to recent experience. The reference memory is defined as a probability density function with bins corresponding to travel times each assigned a probability, and the total area under the function always equal to 1. Following each travel, the reference memory is updated in two steps. First, a fraction (a) of the area under the probability function is subtracted by devaluing each bin in proportion to its probability value at the time, such that the sum of the devaluations equals a. Second, an area the size of a is added back to the probability in the bin corresponding to the current travel time in working memory. Thus, following updating, the total area under the probability density function remains at unity, but the shape of the distribution is shifted toward the most recently experienced travel time, with the size of this shift controlled by the value of the parameter a. Low values of a correspond to little weight being given to recent experience, as would be predicted in a stable environment, whereas high values of a correspond to a high weight being given to recent experience, as would be predicted in changeable environments. The second innovation involves the decision subsystem. Rather than just using a random sample of travel time from reference memory in order to choose the PPV for the patch, Todd and Kacelnik (1993) assumed that the value of travel time used was a weighted average of the value currently in working memory representing the most recently experienced travel time and a random sample taken from reference memory. They introduced a second parameter, p, that controlled the relative weight given to the values from working and reference memory. Just as for a, low values of p correspond to little weight being given to recent experience, whereas high values correspond to a high weighting of recent experience. As in their previous model, the weighted average of the two travel times was used as the input to the MVT to produce the optimal PPV.
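The two-step update and the weighted decision rule described above can be sketched as follows. The parameter names `a` and `p` follow the text; everything else (the function names, the list-of-bins representation, and the use of discrete bin values in `bin_times`) is a hypothetical simplification, and the scalar noise on each remembered value is left out for brevity.

```python
import random


def update_reference_memory(memory, current_bin, a):
    """Shift reference memory toward the most recent travel time.

    memory: list of bin probabilities summing to 1.
    Step 1: devalue every bin in proportion to its probability, removing
            a total area of a.
    Step 2: add the area a back to the bin of the current travel time.
    """
    updated = [prob * (1.0 - a) for prob in memory]
    updated[current_bin] += a
    return updated


def sample_reference(memory, bin_times, rng):
    """Draw one travel time from reference memory, weighted by bin probability."""
    return rng.choices(bin_times, weights=memory, k=1)[0]


def decision_travel_time(working_time, reference_sample, p):
    """Weighted average of working memory (weight p) and a reference sample."""
    return p * working_time + (1.0 - p) * reference_sample


# Illustrative use: four bins, the bird has just experienced bin index 2.
bin_times = [1.0, 13.0, 50.0, 100.0]   # hypothetical bin values in seconds
memory = [0.25, 0.25, 0.25, 0.25]
memory = update_reference_memory(memory, current_bin=2, a=0.2)
rng = random.Random(0)
tau = decision_travel_time(bin_times[2],
                           sample_reference(memory, bin_times, rng), p=0.5)
```

The resulting `tau` would then be passed to whatever MVT calculation gives the optimal PPV. Because the update multiplies every bin by `1 - a` before adding `a` to one bin, the probabilities still sum to 1 after each travel, as the text requires.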

Simulations of the above model run for the two different treatments experienced by the pigeons produced results that mimicked both of the main empirical results: the model produced both the observed treatment difference, with higher PPV in the high mean travel time patch than in the low mean travel time patch, and the within-treatment molecular effect of higher PPV directly following longer travel times.

Again, this study provides a clear example of the benefits of the ethological approach. Without the scalar timing model, we did not have an explanation for why pigeons should respond differently to fixed and variable travel times of the same mean. However, the scalar timing model shows that these effects occur because travel times are stored in reference memory as a distribution rather than as a mean, and as we have seen previously, the distribution representing a fixed travel time is symmetrical, whereas the distribution representing a variable travel time is asymmetrical (see Figure 5.2). The contribution of the evolutionary approach is the realization that it may not always be adaptive to weight all previous experience equally, and therefore the scalar timing model needs mechanisms that control the weighting given recent and past experience.
