to the pattern of the hidden units, producing a smooth response curve centered on the reinforced interval.

FIGURE 2.3 Diagram of the general timing model. The central panel shows the architecture of a simple version of the general timing model, and each layer of the network is labeled to the left. To the right are sample activation curves for the nodes in each layer, with brief explanations.

FIGURE 2.4 Activity diagram of the general timing model. This diagram provides a concise summary of the architecture and functioning of the model. Each of the three equations is evaluated at each time step to provide the level of activity for that node for that time step. The decay constant was 0.935 in all simulations.

Nodes in the hidden layer receive activation from the input layer filtered by the weights of the connections between the two layers, as illustrated in Figure 2.4, Equation 2. This causes these nodes to build up activation over the course of the interval, providing a simple trace clock. Because each node starts with a random set of weights, each node builds up activation in a slightly different fashion. The combination of these different activation curves acts as a complex trace clock.
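As an illustration only, this buildup can be sketched in a few lines of Python. The actual update is Figure 2.4, Equation 2, which is not reproduced in the running text, so the leaky-integrator form below, and every name in it, should be read as assumptions rather than the model's exact equations:

```python
import numpy as np

DECAY = 0.935  # decay constant used in all simulations (Figure 2.4)

rng = np.random.default_rng(0)
n_hidden = 8
w_in = rng.uniform(0.0, 0.1, size=n_hidden)  # random initial input->hidden weights
hidden = np.zeros(n_hidden)                  # hidden-layer activations

def step_hidden(hidden, stimulus=1.0):
    """One time step: each node's activation decays, then integrates the
    weighted stimulus. Different weights give different buildup rates, so
    the population of nodes acts as a complex trace clock."""
    return DECAY * hidden + w_in * stimulus
```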

The output layer receives activation from the hidden layer nodes, and the activation level of the output layer is the overall response strength of the model. The weights between the hidden and output layers control what effect the different nodes have on the response strength, as illustrated in Figure 2.4, Equation 3. For example, a hidden layer node that builds up quickly might have a strong negative impact on responding, because responding early in the interval is unlikely to provide food. Similarly, a hidden node that builds up very slowly might be most active after the reinforcement interval has passed and would therefore act to suppress responding.
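Continuing the sketch above, the output node can be written as a weighted sum of the hidden activations. The sigmoid squashing is an added assumption (Equation 3 itself appears only in Figure 2.4), chosen so the result can be read as a probability:

```python
w_out = rng.uniform(-0.1, 0.1, size=n_hidden)  # signed hidden->output weights

def response_strength(hidden):
    """Overall response strength: a weighted sum of hidden activations,
    squashed into (0, 1) so it can be read as a response probability."""
    return 1.0 / (1.0 + np.exp(-np.dot(w_out, hidden)))
```

A node with a large negative weight in w_out plays the suppressive role described above: the more active it becomes, the lower the response strength.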

The level of activation of the output layer can be taken as the general probability of response. It is certainly possible that there is some minimum threshold below which there is no probability of response. Such a threshold was not used in this model because it was not necessary to observe the basic timing phenomena. If a threshold were used, it would produce the break-run-break pattern of responding found in some timing procedures (e.g., Cheng et al., 1993; Church et al., 1994).
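Adding such a threshold to the sketch would be a one-line change; the cutoff value below is purely hypothetical:

```python
THRESHOLD = 0.3  # hypothetical cutoff; the model itself used no threshold

def respond(hidden):
    """Break-run-break responding: respond only while the response
    strength exceeds the threshold."""
    return response_strength(hidden) > THRESHOLD
```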

2.3.3 Learning Algorithm

The learning algorithm is a slightly modified version of the common backpropagation algorithm. This algorithm works by assessing the amount of error in the model's response and then partitioning the error to determine which weights and nodes need to be adjusted to minimize the error. A more technical review of the algorithm and all of its variants can be found in Haykin (1994). The modified backpropagation algorithm takes place in four steps. First, the amount of error of the output node must be calculated as illustrated in Figure 2.5, Equation 4. This error represents the difference between the amount of reinforcement the model received and the amount it expected to receive, which is represented by the current activation of the output node.
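In the running sketch, this first step reduces to a single subtraction. This is a paraphrase of Figure 2.5, Equation 4, not the equation itself:

```python
def output_error(reinforcement, output):
    """Error of the output node: reinforcement actually received minus the
    reinforcement the model expected (its current output activation)."""
    return reinforcement - output
```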

This output error is then used to adjust the weights between the hidden and output layers, as illustrated in Figure 2.5, Equation 3. If the model has overpredicted how much reward it would receive, the weights for hidden nodes that stimulated the output node will be decreased and the weights for nodes that suppressed the output node will be increased. If the model has underpredicted reward, the opposite will happen. Over time, these weights become relatively fixed as the model becomes better at predicting reinforcement.
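In the sketch, this adjustment can be written as a delta-rule step. Both the exact form of the rule and the value of the learning constant ETA are assumptions here; the chapter's own rule is Figure 2.5, Equation 3:

```python
ETA = 0.1  # learning constant (illustrative value; Figure 2.5 calls it ETA)

def update_output_weights(w_out, hidden, err):
    """When the model overpredicts (err < 0), weights from active hidden
    nodes are pushed down, lowering future predictions; underprediction
    (err > 0) pushes them up."""
    return w_out + ETA * err * hidden
```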

After that, each of the hidden nodes will be apportioned part of the error from the output node, as illustrated in Figure 2.5, Equation 2. These values represent how much each of the hidden nodes contributed to the overall over- or underprediction of the network. They are then used to adjust the weights between the input and hidden layers, as illustrated in Figure 2.5, Equation 1. These weights control how fast activation builds up in the hidden layer. If the hidden layer nodes are underpredicting reward, adjusting these weights upward causes the model's hidden nodes to gain activation faster and thus reach a higher prediction of reward sooner. The difference between this and the standard backpropagation algorithm is that the standard algorithm includes an extra term in the error calculations designed to push the model toward weights of 1 and 0. This was removed because this model is intended to produce a smooth response curve, rather than discrete, all-or-nothing responses.
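These last two steps can be sketched together. The apportioning rule below (each node's output weight times the output error, with no extra derivative term, in line with the modification just described) is an assumed reading of Figure 2.5, Equations 1 and 2:

```python
def update_input_weights(w_in, w_out, err, stimulus=1.0):
    """Apportion the output error to each hidden node in proportion to its
    output weight, then use that share to adjust the node's input weight,
    changing how fast its activation builds up."""
    hidden_err = err * w_out             # each node's share of the error
    return w_in + ETA * hidden_err * stimulus
```

A full training trial would then call step_hidden at each time step of the interval, compute output_error when reinforcement is or is not delivered, and apply both weight updates.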

FIGURE 2.5 Learning diagram of the general timing model. The error for the output node is used to adjust the weights between the hidden and output layers. The error is also divided among the hidden layer nodes and used to adjust the weights between the input and hidden layers. ETA is a learning constant.

[Figure: plot residue; axis labeled "Hidden Layer Size"]
