Operant Reward Learning in Aplysia: Neuronal Correlates and Mechanisms
Björn Brembs1, Fred D. Lorenzetti1, Fredy D. Reyes, Douglas A. Baxter,
1 These authors contributed equally to this work.
Summary: Operant conditioning is a form of associative learning through which an animal learns about the consequences of its behavior. Here we report an appetitive operant conditioning procedure in Aplysia that induces long-term memory. Biophysical changes that accompanied the memory were found in an identified neuron (cell B51) that is considered critical for the expression of behavior that was rewarded. Similar cellular changes in B51 were produced by contingent reinforcement of B51 with dopamine in a single-cell analogue of the operant procedure. These findings allow for the detailed analysis of the cellular and molecular processes underlying operant conditioning.
Learning about relationships between stimuli [i.e., classical conditioning; (1)] and learning about the consequences of one’s own behavior [i.e., operant conditioning; (2)] constitute the major part of our predictive understanding of the world. Although the neuronal mechanisms underlying appetitive and aversive classical conditioning are well studied (e.g., 3-8), a comparable understanding of operant conditioning is still lacking. Published reports include invertebrate aversive conditioning (e.g., 9-12) and vertebrate operant reward learning (e.g., 13). In several forms of learning, dopamine appears to be a key neurotransmitter involved in reward (e.g., 14). Previous research on dopamine-mediated operant reward learning in Aplysia was limited to in vitro analogues (15-18). In this report, we overcome this limitation by developing both in vivo and single-cell operant procedures and describe biophysical correlates of the operant memory.
The in vivo operant reward learning paradigm was developed using the consummatory phase (i.e., biting) of feeding behavior in Aplysia. This model system has several features that we hoped to exploit. The behavior occurs in an all-or-nothing manner and is thus easily quantified (see Video).
The circuitry of the underlying central pattern generator (CPG) in the buccal ganglia is well characterized (e.g., 19). The anterior branch of the esophageal nerve (En2, Fig. 1A) is both necessary and sufficient for effective reinforcement during in vivo classical conditioning and in vitro analogues of classical and operant conditioning (15-18, 20-23). Presumably, En2 conveys information about the presence of food during ingestive behavior. Consequently, we investigated the role of En2 in the reinforcement pathway by recording from it in freely behaving Aplysia via chronically implanted extracellular hook-electrodes (24) (see Methods 1) (Fig. 1A). Little nerve activity was observed during spontaneous biting in the absence of food (Fig. 1B1), whereas bouts (duration: ~3 s) of high-frequency (~30 Hz) activity in En2 were recorded during the ingestion of food (Fig. 1B2). Specifically, this activity was observed in conjunction with ingestion movements of the odontophore/radula (a tongue-like organ).
Electrical stimulation of En2 might thus be used to substitute for food reinforcement in an operant conditioning paradigm. Therefore, in vivo stimulation of En2 at approximately the frequency and duration observed during feeding was made contingent upon each spontaneous bite in freely behaving animals (see Methods 2). Such a preparation is unique in studies of learning in invertebrates and analogous to commonly used self-stimulation procedures in rats (e.g., 13).
One day after implanting the electrodes, animals were assigned to one of three groups: i) a control group without any stimulation, ii) a contingent reinforcement group for which each bite during training was followed by En2 stimulation, or iii) a yoked control group that received the same sequence of stimulations as the contingent group, but the sequence was uncorrelated with their behavior (25). Animals that had been contingently reinforced showed significantly more spontaneous bites during a five-minute test period than both control groups, regardless of whether they were tested immediately after training (Fig. 1C) or 24 h later (Fig. 1D). These results indicate that during ten minutes of contingent stimulation, the animals acquired an operant memory that lasted for at least 24 h.
We next sought to identify changes in the nervous system that were associated with the behavioral modification. The neural activity that underlies the radula movements during feeding is generated by the buccal CPG. This neural network consists of sensory, inter- and motor neurons that continue to produce buccal motor patterns (BMPs), even when the ganglia are removed from the animal (15). In the intact animal, ingestion-like BMPs correspond to radula movements transporting food through the buccal mass into the foregut, as opposed to rejection-like BMPs that correspond to radula movements that remove inedible objects from the foregut (24). Buccal neuron B51 is pivotal for the selection of BMPs. Specifically, B51 exhibits a characteristic, sustained all-or-none level of activity (plateau potential) during ingestion-like BMPs. Moreover, B51 can gate transitions between BMPs: direct depolarization of B51 leads to the production of ingestion-like BMPs, whereas hyperpolarization inhibits ingestion-like BMPs (18). We thus examined whether the observed increase in number of bites was associated with an increase in excitability of B51.
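The three-group design described above (control, contingent, yoked) can be illustrated with a minimal Python sketch. It is not the authors' acquisition software. The function names, the zero stimulation delay, and the 5 s scoring window are hypothetical choices made only for illustration. The sketch shows why the yoked group receives the same stimulation sequence as its contingent partner while that sequence remains uncorrelated with the yoked animal's own bites.

```python
def stim_schedules(contingent_bites, delay=0.0):
    """Return En2 stimulation onset times (s) for the three groups.

    contingent_bites: bite times of the contingently reinforced animal.
    delay: ASSUMPTION, hypothetical latency between bite and stimulation.
    """
    contingent = [t + delay for t in sorted(contingent_bites)]
    return {
        "control": [],               # no stimulation at all
        "contingent": contingent,    # one stimulation per bite
        "yoked": list(contingent),   # same sequence, replayed to a partner
    }


def reinforced_fraction(bites, stims, window=5.0):
    """Fraction of bites followed by a stimulation within `window` seconds.

    window: ASSUMPTION, a hypothetical scoring window, not from the text.
    """
    if not bites:
        return 0.0
    hits = sum(any(0.0 <= s - b <= window for s in stims) for b in bites)
    return hits / len(bites)


if __name__ == "__main__":
    contingent_bites = [200.0, 10.0, 50.0]   # made-up bite times
    yoked_bites = [30.0, 120.0]              # made-up bite times
    schedules = stim_schedules(contingent_bites)
    print(reinforced_fraction(contingent_bites, schedules["contingent"]))
    print(reinforced_fraction(yoked_bites, schedules["yoked"]))
```

With these made-up bite times, every bite of the contingent animal is followed by stimulation, whereas none of the yoked animal's bites is, even though both animals receive an identical stimulation sequence.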
To test the hypothesis that B51 was a site of memory storage for operant conditioning, another set of animals was conditioned (26). Immediately after the last training period, the animals were anaesthetized, dissected and the buccal ganglia prepared for intracellular recording (see Methods 3). Resting membrane potential, input resistance, and burst threshold were measured in B51. Burst threshold was defined as the amount of depolarizing current needed to elicit a plateau potential (see also 16, 18). Cells from the contingent group exhibited a significant decrease in burst threshold (Fig. 2A) and a significant increase in input resistance (Fig. 2B) compared to cells from the yoked control. The resting membrane potential did not differ between the groups (27). The decrease in burst threshold and the increase in input resistance both increase the probability of B51 becoming active and thus increase the probability that a BMP becomes ingestion-like. Our data validate an in vitro analogue of operant conditioning in isolated buccal ganglia (16) and extend the research to include operant conditioning in freely moving Aplysia.
Although the expression of intrinsic changes in the membrane properties of B51 was associated with operant conditioning, the maintenance of these changes could be due to extrinsic factors such as a tonic change in modulatory input to B51. If so, the locus of the associative neuronal mechanism may be upstream of B51. Moreover, as B51 is active during ingestion-like BMPs, the changes in B51 could be the effect of repeated activation, rather than a cause of operantly conditioned animals producing more bites than yoked controls. To resolve this question, we isolated the neuron in primary cell culture and developed a single-cell analogue of the operant procedure. B51 neurons were removed from naïve Aplysia and cultured (see Methods 4).
Dopamine mediates reinforcement in an in vitro analogue of operant conditioning (17) and En2 is rich in dopamine-containing processes (28). Therefore, reinforcement was mimicked by a brief (6 s) iontophoretic “puff” of dopamine onto the neuron. Because B51 exhibits a plateau potential during each ingestion-like BMP, this reinforcement was made contingent upon a plateau potential elicited by injection of a brief depolarizing current pulse. Contingent reinforcement of such B51 activity in the ganglion with En2 stimulation is sufficient for in vitro operant conditioning (18).
Two experimental groups were examined. Building on the experience with in vitro operant conditioning (18), we administered seven supra-threshold current pulses in a ten-minute period to a contingent reinforcement group. Dopamine was iontophoresed immediately after cessation of the plateau potential. An unpaired group received the same number of depolarizations and puffs of dopamine, but dopamine iontophoresis was delayed by 40 s after the plateau potential. Contingent application of dopamine produced a significant decrease in burst threshold (Fig. 3A) and a significant increase in input resistance (Fig. 3B). Apparently, processes intrinsic to B51 are responsible for the induction and maintenance of the biophysical changes associated with operant reward learning.
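The timing difference between the contingent and unpaired groups in the single-cell analogue can be sketched as follows. This is a minimal illustration, not the authors' stimulation protocol. The number of pulses, the ten-minute period, the 6 s puff, and the 40 s delay come from the text. The even spacing of pulses and the 5 s plateau duration used in the example are assumptions, and all function names are hypothetical.

```python
N_PULSES = 7               # supra-threshold current pulses (from the text)
TRAINING_S = 600.0         # ten-minute training period (from the text)
PUFF_S = 6.0               # duration of the dopamine puff (from the text)
GROUP_DELAY_S = {
    "contingent": 0.0,     # dopamine immediately after the plateau ends
    "unpaired": 40.0,      # dopamine delayed by 40 s
}


def pulse_onsets(n=N_PULSES, period=TRAINING_S):
    """Onset times (s) of the depolarizing pulses.

    ASSUMPTION: pulses are evenly spaced; the text states only their number
    and the length of the training period.
    """
    interval = period / n
    return [i * interval for i in range(n)]


def dopamine_windows(plateau_ends, group):
    """Return (start, stop) times of each dopamine puff for one group."""
    delay = GROUP_DELAY_S[group]   # raises KeyError for an unknown group
    return [(t + delay, t + delay + PUFF_S) for t in plateau_ends]


if __name__ == "__main__":
    plateau_s = 5.0   # ASSUMPTION: hypothetical plateau duration
    ends = [t + plateau_s for t in pulse_onsets()]
    for group in GROUP_DELAY_S:
        print(group, dopamine_windows(ends, group)[0])
```

Both groups receive the same number of depolarizations and puffs. Only the interval between the end of the plateau potential and the onset of dopamine differs, which is the variable the comparison isolates.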
The combination of rewarding a simple behavior with physiologically realistic in vivo stimulation identified neuron B51 as one site where operant behavior and reward converge (see Discussion). The results presented here suggest that intrinsic cell-wide plasticity contributes to operant reward learning. Such cell-wide plasticity is also associated with operant conditioning in insects (10). Although B51 is a key element in the neural circuit for feeding, the quantitative contribution of the changes in B51 to the expression of the behavioral changes needs to be elucidated. Given the number of neurons in the feeding CPG (19), it is likely that B51 will not be the only site of plasticity during operant conditioning (nor will cell-wide plasticity likely be the only mechanism). However, the persistent involvement of contingency-dependent cell-wide plasticity in B51 across successively reduced preparations suggests an important role for this mechanism.
Research on Aplysia has provided key insights into mechanisms of aversive conditioning that are evolutionarily conserved. The utility of this model system for learning and memory has now been extended to dopamine-mediated reward learning at the behavioral, network and cellular levels. Our study expands a growing body of literature that shows that dopamine is an evolutionarily conserved transmitter used in reward systems. Future research on Aplysia will provide insights into the subcellular effects of dopamine reward, an area currently under intense investigation in vertebrates (e.g., 8, 13).
References and Notes