The initial task involves reward behaviour and therefore the amygdala. Hippocampal involvement may be added later. Some components of the spatial navigation network may be reused.
The rat should press the lever with a little light on it when the house lights are on.


The current model: operant1.network-20020510.ccm.

In order to arrange the proper order of activity for LTP in this first model, the output stage is being split into two parts: (1) the activity buffer, (2) the output filter. The activity buffer receives the input from the random sequence generator and from the amygdala. Spikes are repeated several times by a combination of fast AHP and ADP. A cumulative slow AHP shuts off the repetition after a short while. The output filter ensures that only the first spike is passed on to the actuators, while subsequent repetitions are blocked by an AHP.
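The behaviour of the output filter described above can be sketched in a few lines of Python. This is only an abstraction and not the Catacomb implementation: the AHP is reduced to a fixed blocking window, and the function name `output_filter` and the window length `block_ms` (150 ms) are assumptions made for illustration.

```python
def output_filter(spike_times_ms, block_ms=150.0):
    """Pass the first spike of each burst and block the repetitions.

    The blocking AHP is abstracted as a fixed window (block_ms, an assumed
    value) that starts whenever a spike is passed on to the actuators.
    """
    passed = []
    last_passed = None
    for t in sorted(spike_times_ms):
        if last_passed is None or t - last_passed >= block_ms:
            passed.append(t)
            last_passed = t
    return passed


# Two buffered bursts of repeated spikes yield one lever press each.
presses = output_filter([100, 110, 120, 130, 500, 510])
```

In this sketch only a passed spike starts a new blocking window, so the window must be longer than the repetition period of the activity buffer.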
This model now manages to learn the correct lever to press, as long as the training data is useful. The training data can be problematic if both levers are randomly pressed at about the same time. For this reason, it might be better to replace the random sequence generator with a specific training sequence.
Since the current model is somewhat simplistic and does not include the plausible episodic learning developed in the spatial navigation models, it is quickly completed as a proof-of-concept. We can then move on to a more elaborate model. The completion of this version is done with the following three updates:
Results: The network now learns that the context "light on" is associated with reward. It also learns which lever press is associated with reward. It correctly recalls the proper lever action when the light is on and a food reward is desired.
Membrane potential response of the context population: The house lights/sound are on between t=100 ms and t=1100 ms in the learning phase, and between t=1480 ms and t=1600 ms in the retrieval phase of the experiment. Corresponding sensory input causes repetitive spiking in a context population.

Training input: Input spikes causing pseudo-random lever pressing choices in the nucleus accumbens (NA) population are depicted as blue (upper lever) and green (lower lever) spikes.

Membrane potential responses of the amygdala population: Between t=0 ms and t=1100 ms, three occurrences of food reward cause spikes during the training phase of the experiment. Each reward increases the synaptic strength of the associated context input (``lights/sound on''), which is clearly visible as an increase in the amplitude of the subthreshold responses to the context input. At t=1300 ms, t=1400 ms and t=1500 ms, ``desire reward'' inputs are presented to the amygdala, producing broader subthreshold responses in the retrieval phase of the experiment. Between t=1480 ms and t=1600 ms, the lights/sound are again on. The combined response causes a spike in the amygdala, which elicits retrieval.

Membrane potential response of the NA population with short-term memory buffer: Training input that causes pseudo-random lever pressing during the learning phase, as well as recall activity received from the amygdala during the retrieval phase, elicit spiking in cells of an NA population that represent upper and lower lever pressing choices. An interaction between after-hyperpolarization (AHP) and after-depolarization (ADP) currents in the hippocampal cells results in short-term memory buffering of spiking activity. The duration of this repetition in the short-term memory buffer is limited by a cumulative slow AHP. Buffering allows LTP to strengthen the synapses from the amygdala to NA cells that correspond with lever pressing activity that resulted in food reward. After the synaptic efficacy with the lower lever cell is strengthened during the learning phase, activity in the amygdala recalls spiking (near t=1500 ms) in the lower lever cell during the retrieval phase of the experiment.

Lever pressing spikes: An output filter ensures that repeated NA activity is propagated as unique lever pressing signals.

Lights and levers: The continuous line response depicts the state of the house lights/sound, which is on between t=100 ms and t=1100 ms, and between t=1480 ms and t=1600 ms. Upper and lower lever presses are indicated by the blue and green spikes respectively.

An even better model would be more at the ``concept'' level, and therefore a lot more like the spatial navigation model. The amygdala would be a separate portion that deals specifically with the ``desire'' for reward, and therefore the reactivation of known episodic memories, until one is found that may lead to the reward. That episode can be recalled once more, together with physical activity that keeps track of the current resemblance to memory. This is very similar to the retrieval of a spatial path. In a model such as that, every conceptual event (``pressed top lever'', ``pressed bottom lever'', ``received reward'', ``did not receive reward'', and in later experiments also spatial navigation concepts) should have pattern (cell) representation. Those representations are then associated in episodic memories (CA3).
In order to find the successful episodic memory when motivated by the amygdala: Should backward spread be adapted to the task, or should the alternative forward-only approach be implemented in order to be usable both here and in the spatial navigation models?
Make list of sequences of events as plausible to lead rat to do right thing in context of environment with light on and environment with lights off.
Plausible sequences of retrieval:
The next, or simultaneous, task is a model with an 8 arm radial maze, in which two arms are used for the training.
The following diagram presents an example of a more general approach to the coupling of the hippocampal episodic memory system, contextual and motivational retrieval systems, such as are involved in conditional cue preference learning.

Immediate priorities for paragraphs to be included in grant application:
The Kantak model seems to propose the following:
The new model is started as operant2.network-20020522.ccm.gz.
I will start by adapting the environment part of the model, so that the virtual rat remains stationary and so that levers and other components are available. I added an area in which the lever pressing experiment can take place and adjusted the experimental protocol such that the virtual rat is placed there at the onset of the experiment. For the initial tests, we must now restrain the virtual rat to keep it stationary. This protocol is achieved through two event sequencers. The first controls a vector switch, so that at t=0 the virtual rat location "drive" input receives a constant zero vector. The second controls an insertion switch and uses the output with index 2 to place the virtual rat into its stationary location in front of the levers and food reward.
The environmental protocol is implemented, together with a stationary virtual rat. It is now possible to encode the context with sound present (or sound being turned on) as a feature-cell in the EC population. Similarly, the other items in the lower right-hand box of the figure above can be encoded and sequences learned in ECIII and CA3. The feature-discretizer produces place related feature input. Other input, such as the sound context and actions (lever pressing, food consumption, etc.) must be added to those features as indexed spikes. The spikes can be concatenated and then converted to channels through the SpikeSplitter component.
Preprocessing multiple streams of feature input to the entorhinal-hippocampal loop:
The new SpikeConcatenator component (available in Catacomb versions 2.049 and above) is used to bring together spikes from multiple sources as input to the entorhinal and hippocampal systems. Spikes with indices between 0 and 9 convey place field features. Spikes with indices 10, 11 and 12 indicate ``pressed lever 0'' (upper lever), ``pressed lever 1'' (lower lever), ``received reward'' (consumed food). Spikes (at 8 Hz) with index 13 indicate that the light is on. Instead of signalling the presence of light, it is possible to signal the transition to the ``light on'' state, which would correspond to the second context approach described in the figure above. Note that the reward consumption that is included as a feature here corresponds to consumption of the reward that is associated with the lever pressing task, not the T-maze navigation task. In compliance with the new memory demands, the population sizes of all regions (EC-STM, PFC-STM, ECIII, CA3, CA1, etc.) are increased to 14 cells each.
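The index layout described above, and the idea of concatenating indexed spikes and then splitting them into channels, can be sketched as follows. This is not the Catacomb SpikeConcatenator or SpikeSplitter API: the function names, the constant names and the representation of a spike as a `(time_ms, index)` tuple are assumptions made for illustration. Only the index assignments and the 14 channels are taken from the text.

```python
import heapq

# Index layout from the text: 0-9 place features, 10-13 task events.
N_PLACE_FEATURES = 10
PRESSED_LEVER_0 = 10   # upper lever
PRESSED_LEVER_1 = 11   # lower lever
RECEIVED_REWARD = 12   # consumed food
LIGHT_ON = 13          # presented at 8 Hz while the light is on
N_CHANNELS = 14


def concatenate_spikes(*streams):
    """Merge time-sorted streams of (time_ms, index) spikes into one list."""
    return list(heapq.merge(*streams))


def split_spikes(spikes, n_channels=N_CHANNELS):
    """Convert indexed spikes into one list of spike times per channel."""
    channels = [[] for _ in range(n_channels)]
    for time_ms, index in spikes:
        channels[index].append(time_ms)
    return channels


# Hypothetical input: place feature 3 at 50 Hz, a lower lever press followed
# by reward, and the light-on signal at 8 Hz (125 ms intervals).
place = [(0.0, 3), (20.0, 3)]
events = [(10.0, PRESSED_LEVER_1), (15.0, RECEIVED_REWARD)]
light = [(0.0, LIGHT_ON), (125.0, LIGHT_ON)]
channels = split_spikes(concatenate_spikes(place, events, light))
```

Each input stream must already be sorted by time for the merge to produce a time-ordered result.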
The problem now is that the existing stages of the network assume that incoming feature data is presented as intervals of repetitive spiking, whereas some of the other input appears in the form of single spikes. The two possibilities are (1) to make all input features into spike trains at a given rate, or (2) to concatenate input from different sources at levels in the model that are beyond the filter stages used to convert the spike trains into single spikes. Since the entorhinal-hippocampal loop is assumed to operate on processed data at a relatively high level, it seems best to make the second choice. This means that the additional information may be best concatenated with place information after both the Place-Input-Type1 filter that precedes EC-STM and the ECII-input filter that precedes CA3. The PFC-STM-place-input filter can be regarded as part of a separate path for now. The version of the model to implement these preprocessing stages is stored as operant2.network-20020604.ccm.
To simplify one part of the merger of information, I am setting the refractory period on the threshold component on the sound sensor input to produce a spike rate equal to that of the feature discretizer (50 Hz, 20 ms spike intervals).
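The relation between refractory period and spike rate used here is simply interval = 1000 ms / rate. A minimal sketch, assuming that the sensor input stays above threshold so that the refractory period alone limits the rate (the function names are hypothetical):

```python
def refractory_for_rate(rate_hz):
    """Refractory period in ms that yields the given spike rate."""
    return 1000.0 / rate_hz


def threshold_spikes(duration_ms, refractory_ms, start_ms=0.0):
    """Spike times of a threshold unit driven by sustained suprathreshold input.

    Assumes the unit fires again as soon as its refractory period ends.
    """
    times = []
    t = start_ms
    while t < start_ms + duration_ms:
        times.append(t)
        t += refractory_ms
    return times


refractory_ms = refractory_for_rate(50.0)          # 20 ms for 50 Hz
sound_spikes = threshold_spikes(1000.0, refractory_ms)
```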
I also needed to increase the channel size from 10 to 14 on all VectorBroadener components in the model.
The addition of other features beyond the two filtering populations seems plausible, since the filtering of sensory input may well occur many stages earlier than the arrival of synchronized spikes at the entorhinal and hippocampal layers. If you use this method, you need filters in which the afferent input produces a slow membrane potential response, so that the synchronization spike can synchronize single-spike input. This is a doable version of the precision timing filters.

We hypothesize that the preprocessing of input signals takes place in regions of the brain along the path to the entorhinal-hippocampal loop, so that the signals arriving there are synchronized with theta. The synchronized preprocessed signals may appear as single spikes that signal state transition events (e.g. input to EC STM) or a spike-train at theta frequency that signals a state condition (e.g. ECII input to CA3).
As soon as the SpikeSplitter works correctly, the parameters in the new Single-Spike-Synchronizer populations must be tuned.
Combining hippocampal and amygdalar paths:
CA1 should go to NA, as should amygdala. NA drives activity. Actually CA1 should go to SUB, which can send activity on to NA, but we only really need SUB once we do actual item encoding/recoding from EC through the orthogonalized representations in hippocampus and to SUB and other cortical areas.

We want to reimplement the contextual stimulus-stimulus associative portion that was achieved with the simple network.
We should seek to maintain the modularity of the regional components, so that their function is preserved within any larger network that complies with the modular I/O timing requirements.
We may now enter a new domain, at least in terms of learned weights, as the new experimental protocol may lead to multi-cell contextual or item activity in the hippocampal populations.
We need to determine how learning and retrieval (via ECIII with backwards spread) can achieve the task. (See the proposed scheme in a figure above.)
We must determine the context-stimulus-reward associative portion (conditioned?). (See descriptions by Kat Kantak.)
It may become necessary to move to a feedforward only implementation, and to incorporate the ability to deal plausibly with multi-cell items, and to include PFC rule-learning circuitry.
At this time, we have no specific mechanism in the backwards-spread implementation that allows a specific context to lead to the retrieval of a particular successful episodic memory. This means that with the backwards-spread implementation, we currently have no way to use the information that ``sound/light is on'' to set up a distinct autoassociative context for retrieval. A feedforward-only implementation would be the most promising solution. In order to achieve some results quickly for inclusion in the grant proposal, we can work with the alternative scheme, in which ``sound/light on'' is an event in an episode within a single environmental context.
Interlude: Update of the simple model:
Following a discussion reviewing progress of the simulations and the write-up of a grant renewal application, several modifications were proposed for the simple model. The simple model is retained for now, as it is more easily explained in the write-up and is ready to produce results.
The model as it is demonstrates context-stimulus association more readily than conditioned-stimulus association, since the ``tone'' stimulus is present as a train of spikes throughout the learning and retrieval phases. Since this model does not focus on the intrinsic operations of the hippocampus, we intend to modify it so that the simple model does demonstrate the conditioned-stimulus associations.
The context-EC population will be retained as the context-stimulus from an abstract hippocampal source, but the context will be provided directly to nucleus accumbens (NA). A conditioned-stimulus signal will be provided to the amygdala population. That signal will be on briefly simultaneous with or immediately after successful lever pressing. Reward signals will be supplied to amygdala and NA. Through learning, the conditioned-stimulus will become as powerful as the reward stimulus, able to drive the behaviour even when the hippocampal context signal is lesioned. In order to elicit behaviour in a retrieval phase when hippocampus has been lesioned, some pseudo-random lever pressing input will be maintained in that phase to initiate lever pressing that provides the conditioned-stimulus for further lever pressing. This new model is stored as operant1.network-20020610.ccm. The abstract hippocampal output may be modified to present activity that responds to the contextual associations with interoceptive sensations of reward.

I will start the modifications by setting up the connections that are needed in the model during retrieval. I will then add connections needed during learning and tune the individual parts.
The abstract hippocampal population is set up so that interoceptive reward sensation causes it to spike during learning. LTP is applied to initially small synaptic efficacies on the ``tone'' context input. After two activations, the efficacy is strong enough to enable spiking in response to the context input. An AHP assures that the hippocampal population produces spikes at 100 ms intervals while the context persists. These spikes can drive drug-seeking and drug-taking behaviour.
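The statement that the context input becomes effective after two activations can be illustrated with a simple additive LTP rule. This is a sketch under assumed values: the initial efficacy, the step size, the ceiling and the spiking threshold are all hypothetical numbers chosen so that two reward-paired activations suffice, and the actual learning rule and parameters of the model may differ.

```python
def apply_ltp(w, step=0.4, w_max=1.0):
    """One LTP application: additive increase, capped at w_max (assumed rule)."""
    return min(w + step, w_max)


def activations_until_spiking(w0=0.1, threshold=0.8, step=0.4, w_max=1.0):
    """Count reward-paired activations needed before the context input
    alone is strong enough to drive spiking (efficacy >= threshold)."""
    w, count = w0, 0
    while w < threshold:
        new_w = apply_ltp(w, step, w_max)
        if new_w == w:
            raise ValueError("efficacy saturates below the spiking threshold")
        w, count = new_w, count + 1
    return count


n_needed = activations_until_spiking()   # 2 with the assumed values
```

Lowering `w_max` or `step` in this sketch corresponds to the kind of tuning described later in these notes, where the maximum learnable conductance and the learning rate on connections to NA are reduced.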
The amygdala population is set up so that reward signals cause spiking during learning. The efficacy of the conditioned stimulus input grows as LTP is applied. This is a brief light that follows successful lever pressing and is represented here by a spike that arrives just prior to the sensation of reward. A spike relay is used, so that spikes can also be delivered after learning by an event sequence generator. Those spikes represent the appearance of the conditioned stimulus in the absence of reward. Following learning, the amygdala population produces spikes in response to the conditioned stimulus that can drive drug-seeking and drug-taking behaviour.
The synaptic efficacies on fibres from the hippocampal population and those from the amygdala population learn associations with the correct lever cell in nucleus accumbens (NA) as LTP is applied in the presence of short-term buffered activity in NA. Either context or conditioned stimuli should then suffice to retrieve the successful drug-seeking and drug-taking behaviour.

I will now demonstrate context-stimulus association driven drug-seeking and drug-taking behaviour following lesioning of the amygdala population, as well as conditioned-stimulus association driven drug-seeking and drug-taking behaviour following lesioning of the hippocampal population.
Note that the drug-seeking and drug-taking behaviour guided by the context-stimulus association may be learned less rapidly than that guided by the conditioned-stimulus association, since contextual spiking occurs at fixed intervals once the context-stimulus association has been established. Spikes from the hippocampal population therefore do not always arrive in close proximity to the exploratory lever pressing spikes in nucleus accumbens.
The desired effect of a lesion that removes the conditioned-stimulus association contribution is that the virtual rat will continue to exhibit the drug-seeking and drug-taking behaviour, but at a reduced rate. This effect can be achieved in several ways:
The second option is chosen, by adding a delay to the time taken to perform the lever pressing. Additionally, the test conditioned stimulus is replaced with random lever pressing that can result in a conditioned stimulus. Question: Why is the conditioned-stimulus association not tested by presenting only the conditioned stimulus, instead of confounding the results with a possible response-response association (lever pressing because lever pressing was done before)?
To do this, a spike delay component is placed on the input to the lever presser. Previously, the reward was not refilled when the conditioned stimulus association was being tested in the absence of the context contribution. The refilling context is now extended so that the experiment can demonstrate multiple conditioned-stimulus responses. (The house lights were previously turned off at 1600 ms with the event "1600 1". That event is now removed.) To demonstrate the rate of response with only the context association (amygdala lesioned), and the rate of response with only the conditioned-stimulus association (hippocampus lesioned), the lesioning intervals are modified. The amygdala is now lesioned between t=2000 ms and t=3300 ms. The hippocampus is lesioned between t=3700 ms and t=5000 ms. With the delay on the lever pressing, it is necessary to increase the time that a lever press response is held in NA, and the time between training presses. Since the buffering in NA is now extended to be able to capture the delayed reward/conditioned-stimulus feedback, the contextual response needed to be slowed slightly and learning of the contextual association slowed down to avoid confusion over the correct lever that the contextual contribution should be associated with.
Continuing with operant1.network-20030211.ccm (in Catacomb version 2.052):
1. separate the list of desired measurements and the list of results, put the results further down in this file
2. note what is being measured (get slices of graph for each and caption with help of text from grant proposal)
3. get a complete graph
4. add the slower memory for consolidation
We seek to achieve the following goals:
Problem: A delay of 50 ms now does cause the conditioned-stimulus association to produce another spike response in the amygdala. That response is propagated to NA, but the NA with prolonged buffering is still repeating the previous activity and the output filter is not yet permitting another lever press.
Conflicting needs caused a conundrum for the desired ability to elicit lever pressing at a reduced rate when only the amygdalar pathway is intact. During training, NA must maintain activity until a conditioned stimulus is received as a result of lever pressing, so that an association can be learned. When random lever pressing should elicit the conditioned behaviour at a reduced rate of lever pressing, the conditioned stimulus presentation following the random lever press should cause subsequent lever pressing. At that time, NA is still maintaining previous activity as required during training. It cannot restart at that time and further lever pressing is inhibited by the output filter.
One way to get around this conflict is to implement a secondary pathway from the amygdala, possibly through another brain region that acts as a buffer, presenting amygdalar activity to NA with a delay. This is implemented in an abstract form in operant1.network-20030211.ccm via a SpikeDelayBuffer of 350 ms. During initial training, the second presentation has no effect, since the conditioned-stimulus association is too weak to initiate lever pressing. After training, this conditioned-stimulus presentation causes lever pressing. A problem with this implementation is that the second pathway must reach NA through the same synapses that were strengthened during training.
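The abstract delayed pathway can be pictured as a queue that releases each amygdala spike a fixed time after it arrives. The class below is a hypothetical sketch and not the Catacomb SpikeDelayBuffer component. Only the 350 ms delay is taken from the text, and spikes are assumed to be pushed in time order.

```python
from collections import deque


class SpikeDelayBuffer:
    """Release each spike delay_ms after it was pushed (illustrative sketch)."""

    def __init__(self, delay_ms=350.0):
        self.delay_ms = delay_ms
        self._pending = deque()   # release times, in increasing order

    def push(self, spike_time_ms):
        """Queue an incoming spike for delayed delivery."""
        self._pending.append(spike_time_ms + self.delay_ms)

    def pop_due(self, now_ms):
        """Return the release times of all spikes due at or before now_ms."""
        due = []
        while self._pending and self._pending[0] <= now_ms:
            due.append(self._pending.popleft())
        return due


# An amygdala spike at t=1000 ms reaches NA directly at once and
# again through the delayed pathway at t=1350 ms.
buffer = SpikeDelayBuffer()
buffer.push(1000.0)
early = buffer.pop_due(1200.0)   # nothing is due yet
late = buffer.pop_due(1350.0)    # the delayed copy is released
```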
In order to achieve a higher rate of lever pressing when both pathways are intact, the space between consecutive trains of NA activity caused by either pathway individually must be increased. The context and conditioned stimulus elicited trains of NA activity must be out of phase, and there can be a difference in their frequency to match relative differences found empirically. The hippocampal response frequency can be modified with the least interaction with other functions by changing the AHP of its cells. The repetition frequency caused by the conditioned stimulus can be modified most easily by changing the delay through the secondary pathway circuitry from the amygdala to NA. In order to safely make alterations, the exploratory lever presses during training must be separated by greater intervals. These modifications are implemented in operant1.network-20030215.ccm.
Right now, training is a bit messed up... the effect of the second pathway during training causes reward at a fixed interval, regardless of exploratory presses. But... that shouldn't be such a problem! Just reduce the learning level on the conditioned stimulus (and possibly the context stimulus) somewhat, and also NA learning - as long as connections to NA are not strong, the second spike should not cause another lever press. To a large degree, this is solved by reducing the maximum conductivity that can be learned on the conditioned stimulus to NA connection. I also reduced the learning rate at the connection to NA. I increased the learning rate on input to the hippocampus slightly, so that it also completes its learning on the third correct training lever press.
In order to address point 5. above, we add a slower learning memory (modeled after the hippocampal population) hypothesized to reside in the neocortex. This more gradual learning can represent consolidation processes that may decrease dependence on the hippocampus for behaviour based on associations with contextual cues. This is modeled in operant1.network-20030216.ccm.
The effect of the consolidation can be shown by extending the experiment so that consolidation can be completed, then testing its effect when the hippocampus is lesioned in both the cases where amygdala is and is not present.
Fig. 15: The drug-seeking drug-taking lever pressing experiment with temporary lesions in hippocampus to NA and amygdala to NA pathways. The square functions at the top and near the bottom of the graph show the experimental hippocampal and amygdala lesioning controls respectively. Spiking responses from top to bottom depict hippocampal, NA and amygdala activity. The spiking responses at the very bottom of the graph depict the measured activity of a neocortical memory region that is hypothesized to consolidate context associations. NA activity is composed of two signals, but the signal corresponding to lever 0 pressing is inactive, as reward is given for lever 1 presses. The experiment consists of seven stages. The first stage is initial training with exploratory lever presses between t=0 ms and t=4400 ms. A brief pause from t=4400 ms to t=5500 ms separates this stage from the remainder of the experiment, in which behavioural activity and corresponding neuronal responses are measured. All pathways are intact in the second stage, from t=5500 ms to t=8200 ms. The pathway from the amygdala to NA is lesioned in the third stage, from t=8500 ms to t=11600 ms, while the hippocampal pathway is intact. This is reversed in the fourth stage, with a hippocampal lesion and intact amygdalar pathway from t=11800 ms to t=15000 ms. Between t=15000 ms and t=20000 ms, the fifth stage allows context associations to be consolidated in neocortical storage. In the sixth stage, from t=20000 ms to t=22700 ms, as in stage four the pathway from the amygdala to NA is intact and the hippocampal pathway is lesioned, yet consolidated context associations are available in neocortex. Finally, both hippocampal and amygdalar pathways are lesioned in the seventh stage from t=22700 ms to t=25000 ms, so that consolidated context associations are the predominant cause of directed drug-seeking and drug-taking behaviour.
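For reference, the stage timing given in the caption of Fig. 15 can be written down as a small lookup table. The sketch below is only a convenience for reading the graphs and is not part of the model. The tuple layout and the function name `stage_at` are assumptions, and the lesion entry is left as `None` where the caption does not state the lesion condition explicitly (stages 1 and 5).

```python
# (stage, start_ms, end_ms, lesioned pathways to NA or None if not stated)
STAGES = [
    (1, 0, 4400, None),                              # initial training
    (2, 5500, 8200, frozenset()),                    # all pathways intact
    (3, 8500, 11600, frozenset({"amygdala"})),
    (4, 11800, 15000, frozenset({"hippocampus"})),
    (5, 15000, 20000, None),                         # consolidation
    (6, 20000, 22700, frozenset({"hippocampus"})),
    (7, 22700, 25000, frozenset({"hippocampus", "amygdala"})),
]


def stage_at(t_ms):
    """Return the stage tuple containing t_ms, or None between stages.

    Intervals are treated as half-open, [start_ms, end_ms).
    """
    for stage in STAGES:
        _, start_ms, end_ms, _ = stage
        if start_ms <= t_ms < end_ms:
            return stage
    return None


stage_number = stage_at(13400)[0]   # falls in stage 4
```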
Fig. 16: Responses during training. Three spikes occur in hippocampus during training as an interoceptive sensation of reward is received when lever 1 is pressed in the presence of the context stimulus. Nucleus accumbens activates for lever pressing actions during training. Responses of NA cells 0 and 1 lead to the pressing of lever 0 and 1 respectively. Activity in NA is buffered for a little over 100 ms so that associations with hippocampal and amygdalar activity can be made. Amygdala spikes occur during training when reward is received. This allows an association to be made with conditioned stimulus activity received in the amygdala. The association is tested prior to learning at t=200 ms, resulting in a subthreshold amygdala response. The association is strong after training. This can be seen after t=4000 ms, when activity in amygdala caused by the conditioned stimulus corresponding to a training lever press at t=3900 ms leads to further lever pressing. In addition to the rapid amygdala to NA pathway that established an association during training, activity is hypothesized to propagate through neuronal circuitry that forms a delayed pathway to NA. This delayed propagation of the conditioned stimulus response causes NA to activate around t=4300 ms. The lever press driven by that activity supplies another conditioned stimulus. The consequent response in the amygdala (near t=4350 ms) does not lead to further lever pressing, since the training stage is ended at that time with a temporary lesion of both hippocampal and amygdala pathways.
Fig. 17: Responses in the intact system. During the second stage of the experiment, the hippocampus responds with regular spiking activity to the presence of the context stimulus and the amygdala responds to the conditioned stimulus. Trains of activity in NA are triggered by spiking in amygdala as well as hippocampal regions. Its frequency of activity is limited by a slow after-hyperpolarization of the cell population. Near t=5700 ms, hippocampal activity triggers a spike train in NA that leads to drug-seeking lever pressing. The corresponding conditioned stimulus causes a response in the amygdala. This leads to renewed NA activity and lever pressing around t=6050 ms. In that instance, a contextual response from the hippocampus falls within a strongly hyperpolarized interval following the spike train in NA. The interval until subsequent activity in NA is therefore again determined by the conditioned stimulus response rate (around t=6500 ms). Around t=6800 ms, a hippocampal response results in NA activity, so that the average rate of drug-seeking lever pressing in the intact system is greater than the individual context or conditioned stimulus response rates. A similar pattern is repeated for the four trains of spiking activity in NA from t=6800 ms to t=7900 ms. The average frequency of drug-seeking lever presses is about 2.7 Hz.
Fig. 19: Drug-seeking lever pressing responses when the pathway from amygdala to NA is lesioned, while the pathway from the hippocampus remains intact. The responses in the amygdala to the conditioned stimulus that corresponds with lever-pressing are shown, but due to the precise temporary lesion their effect cannot reach the nucleus accumbens. The presence of the context stimulus causes repeated activity in the hippocampus at a rate of about 1.7 Hz. The consequent activity in NA supports drug-seeking lever pressing at a reduced rate compared to the intact system.
Fig. 18: Drug-seeking lever pressing responses when the pathway from hippocampus to NA is lesioned, while the pathway from the amygdala remains intact. Hippocampal responses to the context stimulus cannot reach NA. When a random lever press at t=13400 ms causes a conditioned stimulus response in the amygdala, repeated lever pressing ensues. Drug-seeking lever presses occur at the reduced rate determined by the delay in neuronal circuitry propagating activity to NA. The frequency of lever presses when the hippocampus is lesioned and the conditioned stimulus is presented is about 2.4 Hz (1.6 Hz over the entire interval of the stage) in this experiment.
Fig. 20: Consolidation of contextual associations is an ongoing process in experimental stages 1 through 5. The period required for consolidation is compressed to a mere 20 seconds in this experiment for the sake of computational efficiency and presentation. An adjustment of the learning parameters to reflect the true gradual nature of consolidation would result in a realistic time-span.
Fig. 21: Following consolidation of context associations in a neocortical memory store, responses are measured when the pathway from the hippocampus is lesioned and the pathway from the amygdala is intact. Responses in the neocortical region can make up for some of the lost input, so that the drug-seeking lever pressing rate is greater than that achieved in stage four of the experiment. The rate is similar to that in the intact system, with a drug-seeking lever pressing frequency of about 2.7 Hz.
Fig. 22: When both pathways from the hippocampus and the amygdala to NA are lesioned after consolidation of context associations, drug-seeking lever pressing in NA depends on responses arriving from the neocortical memory. The drug-seeking lever pressing rate then corresponds to the rate of neocortical context associative activity at about 2 Hz.
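The lever pressing frequencies quoted in the captions above are mean rates over an interval, and the choice of interval matters, as in the two values given for the hippocampal lesion stage. A minimal sketch of that calculation follows. The function name is hypothetical and the press times are made-up illustrative values, not simulation output.

```python
def press_rate_hz(press_times_ms, start_ms, end_ms):
    """Mean lever pressing rate in Hz over the half-open window [start_ms, end_ms)."""
    if end_ms <= start_ms:
        raise ValueError("end_ms must be greater than start_ms")
    count = sum(1 for t in press_times_ms if start_ms <= t < end_ms)
    return count / ((end_ms - start_ms) / 1000.0)


# Illustrative values only: a 3200 ms stage in which pressing starts
# halfway through, after a random lever press.
presses = [1600.0, 2000.0, 2400.0, 2800.0]
rate_whole_stage = press_rate_hz(presses, 0.0, 3200.0)          # 1.25 Hz
rate_after_first_press = press_rate_hz(presses, 1600.0, 3200.0)  # 2.5 Hz
```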
For an alternate presentation of the results, pathway lesions are transformed into lesions of the hippocampal and amygdala regions and the hippocampal region is not lesioned during the neocortical consolidation stage. This is implemented in the model stored as operant1.network-20030227.ccm.
Fig. 23: The complete experiment as in Fig. 15 with lesions of entire hippocampal and amygdala regions and without lesioning of the hippocampus during the stage that enables the completion of consolidation.