FIGURE 2.23 Scalar property simulation. The model produces an approximately scalar response curve over a range of 12 to 120 time steps. Outside this range, the function becomes distorted. (Axis label: Proportion of Interval.)

a day) have shown timing that is more precise than would be predicted by scalar timing. Even at shorter timescales, not all variables are strictly proportional to the reinforcement interval (e.g., Zeiler and Powell, 1993). While scalar timing may not be absolute, it is a useful heuristic in thinking about how animals time. All existing models of interval timing demonstrate this property, and the method for scaling represents one of the primary defining features of any model.

Scaling in the general timing model is accomplished by adjusting the weights between the input layer and the middle layer. These weights govern the rate at which the hidden layer nodes become activated. If they build up quickly, the model can learn to respond to very short intervals. With very small weights they build up slowly, allowing the model to respond at longer intervals.
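The relationship between weight size and build-up speed can be illustrated with a toy accumulator. This is a minimal sketch, not the chapter's model: the saturating update rule, the `threshold` value, and the function name `steps_to_threshold` are all assumptions introduced for illustration.

```python
def steps_to_threshold(weight, threshold=0.9, max_steps=100_000):
    """Count the time steps a toy hidden unit needs to reach `threshold`.

    Hypothetical update rule (an assumption, not taken from the chapter):
        activation <- activation + weight * (1 - activation)
    """
    activation = 0.0
    for step in range(1, max_steps + 1):
        activation += weight * (1.0 - activation)
        if activation >= threshold:
            return step
    return None  # never reached the threshold within max_steps


if __name__ == "__main__":
    # Smaller input-to-hidden weights give slower build-up, hence longer timed intervals.
    for weight in (0.2, 0.02, 0.002):
        print(f"weight={weight}: {steps_to_threshold(weight)} steps")
```

Under this rule, shrinking the weight by a factor of ten lengthens the build-up by roughly the same factor, which is the sense in which the weights set the timescale.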

The model's ability to scale can be seen in Figure 2.23. For reinforcement intervals between 12 and 120 time steps, the model scales fairly well. Response rates are typically low during the first quarter of the interval and then accelerate rapidly. At intervals under 12 time steps, the peak response occurs later than the interval. With intervals longer than 120 time steps, the peak response always occurs before time step 120, as illustrated in Figure 2.24.

At shorter intervals, the fact that the simulation must break time into discrete time steps becomes a problem. With only one data point per time step, a clear curve is difficult to discern. It also becomes difficult for the model to build up activity in the hidden layer fast enough to produce a full response curve. This is less a failing of the model than an inevitable artifact of computer simulation. It could be corrected by assuming that time steps are small enough (say, ten per second) that reaction time limits would be reached before this artifact mattered.

FIGURE 2.24 Scalar property failure. The model produces an approximately scalar response curve over a range of 12 to 120 time steps. Outside this range, the function becomes distorted. This figure shows the distorted peak-interval functions produced by 6 and 180 time step intervals, plus a properly scalar 60 time step interval for comparison. (Axis label: Proportion of Interval.)

However, this would mean that a 30-sec interval would have 300 time steps, which runs into the upper limit of the model's ability to scale. The learning algorithm seems unable to shift the peak response rate past 120 time steps. This seems to be a product of the fact that the weights between the input and hidden layers would have to be very small in order for activation to build up slowly enough for longer intervals. The learning algorithm seems incapable of adjusting those weights properly at very small values.
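The trade-off between fine time steps and the upper scaling limit is simple arithmetic, sketched below. The 12 to 120 step range and the ten-steps-per-second example come from the text; the function names are hypothetical.

```python
SCALABLE_RANGE = (12, 120)  # time steps over which the model scales (from the text)


def interval_in_steps(seconds, steps_per_second):
    """Convert a reinforcement interval in seconds to simulation time steps."""
    return round(seconds * steps_per_second)


def within_scalable_range(n_steps, limits=SCALABLE_RANGE):
    """True if an interval of `n_steps` falls inside the range where scaling holds."""
    low, high = limits
    return low <= n_steps <= high


if __name__ == "__main__":
    steps = interval_in_steps(30, steps_per_second=10)
    print(steps, within_scalable_range(steps))
```

At ten steps per second, the scalable range of 12 to 120 steps corresponds to intervals of only 1.2 to 12 sec, so a 30-sec interval falls well outside it.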

### 2.4.3.4 Possible Techniques for Scalar Timing

The simplest method for creating perfect scaling is to make the time steps of the model proportional to the reinforcement interval. This has been done in several models, most notably in BeT. There is no need to demonstrate that this method would work because it is essentially assuming the scaling happens. By scaling the time steps, we automatically scale the timing function and create the result. This mechanism is ill suited to the general timing model. The length of a time step is not something that can be changed by a neural network while it is running. The time step is analogous to a physical constant, a fixed principle of the universe the model inhabits. The model can change what it does each time step, but not the substrate in which it functions. This is not a constraint of this particular model, but of neural network models in general. There is simply no mechanism within this type of model to adjust the overall rate at which time flows.
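Why this method assumes its own result can be seen in a small sketch. It is not an implementation of BeT; the Gaussian response profile, the constant `N_TICKS`, and the function names are assumptions chosen only to show that a clock whose tick length is proportional to the interval yields identical curves in proportional time.

```python
import math

N_TICKS = 20  # hypothetical number of clock ticks per reinforcement interval


def response_rate(tick):
    """Hypothetical response profile defined over tick index, peaking at tick N_TICKS."""
    return math.exp(-((tick - N_TICKS) ** 2) / (2.0 * 4.0 ** 2))


def response_curve(interval):
    """Return (proportion of interval, response rate) pairs for a given interval.

    The tick length is set proportional to the interval, which is the
    scaling assumption being illustrated.
    """
    tick_length = interval / N_TICKS
    curve = []
    for tick in range(2 * N_TICKS + 1):
        elapsed = tick * tick_length
        curve.append((elapsed / interval, response_rate(tick)))
    return curve


if __name__ == "__main__":
    short, long_ = response_curve(12), response_curve(120)
    same = all(
        math.isclose(p1, p2) and r1 == r2
        for (p1, r1), (p2, r2) in zip(short, long_)
    )
    print("curves superimpose:", same)
```

The interval cancels out of the proportional-time axis, so the curves superimpose regardless of the profile chosen; nothing has been learned or explained.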

A stronger position can also be taken: that scaling by adjusting the time steps is theoretically bankrupt. Any mechanism that could adjust the time step length could also be made to adjust the model itself, and adjusting the model is more parsimonious. In terms of the performance of the model, there is no difference between the two. All models already assume that some sort of learning takes place, and scaling should take place within that already existing mechanism for learning. Letting the time steps scale outside the existing system allows the modeler to make scaling happen by fiat, rather than by building it into the model.

Another alternative, used by all of the neural timing models mentioned earlier, is to include in the model an array of preset timescales and allow the model to use whichever suits the situation. In the spectral timing model, this is done through having an array of nodes with different parameters, each building up at a slightly slower rate than the last. In the connectionist SET model, each oscillator has its own rate. In the CBeT model, the units are connected in series, so the first few units are only active at short timescales and the last few units are only active at long timescales. This makes learning the correct scale very easy, because the portion of the model that runs at the correct rate will always predict the reinforcement better than the rest of the model.
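The shared idea behind these approaches, a bank of preset timescales from which the best predictor is selected, can be sketched as follows. This does not reproduce the spectral timing, connectionist SET, or CBeT models; the saturating build-up rule, the bank of rates, and all names are hypothetical.

```python
import math


def crossing_step(rate, threshold=0.9):
    """First step n at which a unit with activation 1 - (1 - rate)**n reaches threshold.

    The build-up rule is an assumption made for this sketch.
    """
    return math.ceil(math.log(1.0 - threshold) / math.log(1.0 - rate))


# Hypothetical bank of units, each building up half as fast as the last.
RATE_BANK = [0.5 / 2 ** k for k in range(8)]


def best_rate(interval_steps, rates=RATE_BANK):
    """Pick the preset rate whose threshold crossing lies closest to the interval."""
    return min(rates, key=lambda rate: abs(crossing_step(rate) - interval_steps))


if __name__ == "__main__":
    for interval in (12, 120, 500):
        rate = best_rate(interval)
        print(f"interval={interval}: rate={rate}, crosses at step {crossing_step(rate)}")
```

Selection here is a simple nearest match, standing in for whatever learning rule rewards the unit that best predicts reinforcement.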

In the general timing model, the easiest way to do this would be to make some of the hidden nodes act slower than the rest. A node that learned only once every five time steps would essentially live and learn at a timescale five times as long as its neighbors'. For this node, a 300-sec schedule would really be a 60-sec schedule, well within the range for which the model scales. If each node in the hidden layer ran at its own rate, this would provide a wide range of timescales for learning. This solution has not been employed in this model to date because it requires an additional layer of assumptions. These assumptions are reasonable and in accordance with what we know of animal learning, but they would still be adding another layer of complexity to the model. The focus of this chapter is to explore how much of animal timing can be explained with a minimally changed general learning model.
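The arithmetic of slower nodes can be checked with a short sketch. The update periods listed and the function names are hypothetical; the 12 to 120 step range is taken from the simulations reported earlier.

```python
SCALABLE_RANGE = (12, 120)  # time steps over which the model scales (from the text)


def effective_steps(interval_steps, update_every):
    """Number of updates a node experiences if it updates once every `update_every` steps."""
    return sum(1 for step in range(1, interval_steps + 1) if step % update_every == 0)


def usable_periods(interval_steps, periods, limits=SCALABLE_RANGE):
    """Update periods for which the interval falls inside the scalable range."""
    low, high = limits
    return [p for p in periods if low <= effective_steps(interval_steps, p) <= high]


if __name__ == "__main__":
    # A node updating every fifth step experiences a 300-step interval as 60 updates.
    print(effective_steps(300, update_every=5))
    print(usable_periods(300, periods=[1, 2, 5, 10]))
```

With a mix of update periods in the hidden layer, at least some nodes would see any given interval at a scale the learning algorithm can handle.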

A related solution is that used by artificial neural network researchers attempting to model complex time series. Day and Davenport (1993) incorporated time delays into each node of the network that can be adjusted in a manner similar to that of the network weights. Some of the hidden layer nodes can adjust their delays to match the scale of the reinforcement interval, allowing the model to produce a properly scaled output function. This solution again requires an additional layer of assumptions and is therefore beyond the scope of this project.

The best potential method for creating scaling is modifying the learning algorithm. If the various weights are preset and fixed, the model can produce the correct timing function at a wide variety of timescales. The question then becomes: What about the learning mechanism needs to be changed in order to drive the model toward those solutions? At both extremes, the problem seems to be that the changes to the weights are not proportional to the weights. When the required interval is very small, the weights need to be larger than the learning algorithm seems able to make them. When the interval is very large, the upward adjustment of weights by reinforcement seems to make them too large. To date, attempts to make the changes proportional to their current values have not been successful in allowing the model to scale over a wider range. This does not mean that the solution is wrong, just that implementing it is not trivial. Hopefully, future explorations of this timing model can solve this problem.
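The difference between a fixed-size weight change and one proportional to the current weight can be made concrete. This sketch only illustrates the idea of proportional updates; it is not the learning algorithm used in the model and, as noted above, such attempts have not yet succeeded. The function names, the error term, and the learning rate are assumptions.

```python
def additive_update(weight, error, learning_rate=0.01):
    """Fixed-size step: the change does not depend on the current weight."""
    return weight + learning_rate * error


def proportional_update(weight, error, learning_rate=0.01):
    """Step scaled by the current weight: the relative change stays constant."""
    return weight + learning_rate * error * weight


def relative_change(update, weight, error=1.0):
    """Fractional change in the weight produced by one update."""
    return (update(weight, error) - weight) / weight


if __name__ == "__main__":
    for weight in (0.5, 0.005):
        print(
            f"weight={weight}: "
            f"additive {relative_change(additive_update, weight):.2%}, "
            f"proportional {relative_change(proportional_update, weight):.2%}"
        )
```

With the additive rule, a single reinforcement triples a weight of 0.005 while barely moving a weight of 0.5, which is one way the upward adjustment could overshoot at long intervals.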

### 2.4.3.5 Noise and Variation

One of the other scalar features of interval timing is that the variation in the shape of the timing function scales as well. This does not occur in this model because the timing function is basically static for any given interval length. However, if the output of the timing function represented the probability of the animal emitting a response, then the variations in the function would scale as well.
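One way to read this suggestion is sketched below. If the timing function is treated as a per-step response probability, the spread of response times in real time grows with the interval while the spread relative to the interval stays roughly constant. The Gaussian shape, the `width` parameter, and the function names are assumptions, not properties of the model.

```python
import math


def response_probability(step, interval, width=0.2):
    """Hypothetical per-step response probability, Gaussian in proportional time."""
    proportion = step / interval
    return math.exp(-((proportion - 1.0) ** 2) / (2.0 * width ** 2))


def response_time_stats(interval):
    """Mean and standard deviation of response times, weighted by response probability."""
    steps = range(0, 2 * interval + 1)
    weights = [response_probability(s, interval) for s in steps]
    total = sum(weights)
    mean = sum(s * w for s, w in zip(steps, weights)) / total
    variance = sum(w * (s - mean) ** 2 for s, w in zip(steps, weights)) / total
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    for interval in (12, 60, 120):
        mean, sd = response_time_stats(interval)
        print(f"interval={interval}: mean={mean:.1f}, sd={sd:.1f}, sd/mean={sd / mean:.3f}")
```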
