2.4.2 Learning Phenomena

2.4.2.1 Learning the Time Marker

One of the defining features of this model, compared to other timing models, is that it can determine which of its inputs is the timing signal. This process of deciding which stimuli are related to which outcomes is sometimes referred to as the assignment-of-credit problem (Minsky, 1961). Although it is one of the primary issues in artificial intelligence research and all other artificial learning research, it has been essentially ignored by timing researchers. This problem was simulated by presenting the model with ten inputs, one of which was the true timing signal. The others switched between off and on randomly, independently of each other and of the reinforcement schedule. As seen in Figure 2.15, the model learned to accurately predict the reinforcement and to ignore the irrelevant stimuli.
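The input stream for a simulation of this kind can be sketched as follows. This is only an illustration, not the model's actual code: the function name `make_inputs`, the trial length, the seed, and the choice to keep the timing signal on from a fixed onset are all assumptions. The toggle probability and the count of eight distractors follow the values given in the caption of Figure 2.15.

```python
import random


def make_inputs(n_steps, n_random, p_toggle, signal_onset=0, seed=0):
    """Build one trial of input vectors (hypothetical helper).

    Column 0 is the timing signal: off before signal_onset, on afterwards.
    Columns 1..n_random are distractors. Each distractor flips its state
    (off -> on or on -> off) with probability p_toggle on every time step,
    independently of the others and of the timing signal.
    """
    rng = random.Random(seed)
    state = [0] * n_random  # all distractors start in the off state
    rows = []
    for t in range(n_steps):
        # Flip each distractor independently with probability p_toggle.
        state = [1 - s if rng.random() < p_toggle else s for s in state]
        timing = 1 if t >= signal_onset else 0
        rows.append([timing] + state)
    return rows


if __name__ == "__main__":
    # One of the four toggle probabilities from the caption: 1/10.
    trial = make_inputs(n_steps=60, n_random=8, p_toggle=1 / 10)
    for row in trial[:5]:
        print(row)
```

Because the distractors carry no information about when reinforcement occurs, any learner fed these vectors can only predict the reward by relying on column 0, which is the behavior the simulation tests for.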

2.4.2.2 Parsimonious Timing

One anecdotal finding in timing research is that subjects presented with multiple timing signals will generally use the one closest in time to the reward. This is logical because the absolute size of timing errors will be smaller with the shorter interval between the time marker and the reinforcement. In this simulation, the model was presented with two timing signals. The first started 60 time steps before the reward was presented; the second started 30 time steps before the reward. Both signals accurately predicted the reward, and both were equally salient. The result can be seen in Figure 2.16. The model begins responding much later than it would if it were trained with only the long signal. This indicates that the model naturally gravitates toward the most parsimonious timing signal.

FIGURE 2.15 Assignment-of-credit simulations. These simulations were the same as the fixed-interval simulations with the addition of eight input nodes that turned on and off randomly. Four simulations were run with different probabilities of the random input nodes turning off and on each time step: 1/10, 1/30, 1/60, and 1/120. These data represent the average of the 51st through 55th trials.
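The argument that the shorter interval yields smaller absolute timing errors can be made concrete with a small calculation. The sketch below assumes that timing error grows in proportion to the timed interval (a constant coefficient of variation); the value 0.2 and the function name `timing_sd` are hypothetical and chosen only for illustration, not taken from the model.

```python
def timing_sd(interval, cv=0.2):
    """Standard deviation of the timing error for a given interval.

    Assumes error is proportional to the interval being timed;
    cv is an illustrative coefficient of variation, not a fitted value.
    """
    return cv * interval


if __name__ == "__main__":
    # The two signals in the simulation: 60 and 30 time steps before reward.
    for interval in (60, 30):
        print(f"interval={interval:3d} steps  error SD={timing_sd(interval):.1f} steps")
```

Under this assumption, halving the interval halves the expected absolute error, so a learner that weights its inputs by predictive accuracy has a reason to favor the 30-step signal.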

2.4.2.3 Initial Fixed-Interval Behavior

In general, current timing models excel at describing how animals respond after extensive training. This steady-state behavior is relatively static and predictable and does not require a real-time model. However, there are distinct patterns of responding found in the first few trials under an FI schedule. Before being exposed to the fixed-interval schedule, most animals are trained to respond using a continuous reinforcement (CRF) schedule, in which they are rewarded for every response. Once they are responding at a sufficient rate, the subjects are switched to the FI schedule. On the first trial or two after the switch, subjects typically produce an inverted scallop pattern, initially responding at a very high rate and then tapering off. Over successive
