Mushroom Bodies Regulate Habit Formation in Drosophila
Received 19 December 2008; revised 3 June 2009; accepted 4 June 2009. Published online 2 July 2009.
To make good decisions, we evaluate past choices to guide later decisions. In most situations, we have the opportunity to simultaneously learn about both the consequences of our choice (i.e., operantly) and the stimuli associated with correct or incorrect choices (i.e., classically). Interestingly, in many species, including humans, these learning processes occasionally lead to irrational decisions. An extreme case is the habitual drug user consistently administering the drug despite the negative consequences, but we all have experience with our own, less severe habits. The standard animal model employs a combination of operant and classical learning components to bring about habit formation in rodents. After extended training, these animals will press a lever even if the outcome associated with lever-pressing is no longer desired. In this study, experiments with wild-type and transgenic flies revealed that a prominent insect neuropil, the mushroom bodies (MBs), regulates habit formation in flies by inhibiting the operant learning system when a predictive stimulus is present. This inhibition enables generalization of the classical memory and prevents premature habit formation. Extended training in wild-type flies produced a phenocopy of MB-impaired flies, such that generalization was abolished and goal-directed actions were transformed into habitual responses.
Author Keywords: SYSNEURO
A tethered fruit fly, Drosophila, in the absence of sensory information, continuously changes its choice of flight direction. Much as humans learn about the consequences of their actions, the fly's choices can be modulated by learning about the consequences of such decisions as when and where to turn (i.e., operant learning). Procedurally, the task is not significantly altered by adding predictive stimuli such that one (e.g., blue coloration of the environment) indicates which turning maneuvers (e.g., left turning) are punished with heat and the other (e.g., green coloration) indicates which decisions (e.g., right turning) are not punished; the decisions are still followed by the same consequences. However, this helpful indication of correct and incorrect choices drastically alters the biological processes underlying the task. Without the help of the colors, the flies require protein kinase C function, but not the rutabaga-encoded type I adenylyl cyclase, to learn to make the correct choice; with the colors, the results are reversed. In order to further investigate this dominant effect of the classical colors on operant learning, wild-type and transgenic flies were first trained in situations with both operant and classical components present and were then tested for each component individually (Figure 1).
Figure 1
During training, heat is used to simultaneously condition flies both to avoid turning to one direction (right or left; operant component) and to avoid one of two colors (blue or green; classical component). In a subsequent test without heat, the flies' spontaneous preference is recorded. One group of flies is tested in the same situation as during training (1). A second group of flies is tested for the operant component in isolation by removing the classical component (2). A third group of flies is tested for the classical component by replacing the operant behavior controlling the colors with a novel behavior (3; see Supplemental Experimental Procedures for details). The red (operant + classical), green (operant component), and blue (classical component) color scheme applies to all subsequent figures.
Given the dominance of the colors in this paradigm, it may not come as a surprise that in a test without heat, after 8 min of such composite training, wild-type flies did not reveal any preference for left- or right-turning maneuvers if the helpful color filters were removed (Figure 2A; i.e., the isolated operant component, situation 2 in Figure 1). Apparently, the colors inhibit operant learning. Why would the flies not learn an important predictor of punishment such as their own behavior? One hypothesis is that operant learning might lead to behavioral modifications, which in turn could potentially interfere with generalization of the classical color memory. Sensorimotor learning interfering with behavioral flexibility (“habit interference”) is a well-known phenomenon, and the balance between interference and transfer/generalization is a popular research topic. We tested for the generalization of the classical memory by measuring color preference in a situation where straight flight (as opposed to constant turning) was required to reliably avoid the previously punished color (i.e., situation 3 in Figure 1, previously described; see Supplemental Experimental Procedures available online for details). After 8 min of composite training and a brief reminder training, wild-type flies successfully avoided the punished color via this orthogonal behavior (Figure 2A). A commonly used experimental procedure to induce sensorimotor learning in other animals is overtraining. According to the hypothesis above that learning of the operant behavior in flies may be analogous to sensorimotor learning leading to habit formation in mammals, extended training in flies should overcome the inhibition of operant learning and lead to a failure to generalize the isolated classical memory to the novel behavior.
Consistent with this hypothesis, flies that were trained with equivalent operant and classical predictors for twice the regular amount of time showed significant performance indexes (PIs) in the control and in the operant test and no significant score in the generalization test (Figure 2B). Observing the behavior of the flies, it was noticeable that, after extended training, some flies seemed to generate larger turning maneuvers toward the previously unpunished direction, compared with more symmetrical maneuvers from flies exposed to the regular amount of training (Figure 2C). A quantitative evaluation of the behavioral data tended to confirm the qualitative observations, but the number of animals was too low to reach statistical significance (data not shown). Taken together, these results indicate that adjusting to the novel situation after extended composite training is difficult enough to disrupt performance in the generalization task (habit interference). Similar to a rodent pressing a lever for an aversive stimulus, the fly, also only after extended training, keeps generating behaviors that interfere with avoiding the previously punished color.
Figure 2
(A) Standard 8 min training in wild-type (WT) flies. Whereas there is significant composite learning (red: t31 = 5.1, p < 0.001), the score for the isolated operant component does not reach significance (green: t24 = −0.3, p < 0.8; not even after a 60 s reminder training, data not shown). However, there is significant transfer of the classical color memory to a novel behavior (blue: t19 = 3.1, p < 0.01), indicating successful generalization.
(B) Extended 16 min training reverses the scores for the isolated components. The longer training duration does not lead to an overtraining decrement (t16 = 2.8, p < 0.013). Testing for the operant component shows a release from the inhibition of operant learning (t16 = 2.6, p < 0.02). Without inhibition of the operant system, the flies are unable to generalize (t19 = 0.1, p < 0.91).
(C) Example raw data traces from the generalization test (situation 3 in Figure 1). Data from two wild-type flies during the test period depicted in (A) and (B). The red traces depict the turning maneuvers (yaw torque) used to change flight direction (blue trace, pattern position) and hence coloration of the environment (background color of the graph). Upper traces: fly after 8 min of training to turn right and avoid green color (pooled data in A). Lower traces: fly after 16 min of training to turn left and avoid blue color (pooled data in B). Whereas the fly trained for the regular amount of time shows symmetrical turning maneuvers, the fly trained for an extended period of time shows left turning yaw torque spikes (discrete turning maneuvers) of uniformly larger amplitude than its right turning yaw torque spikes (traces enlarged in Figure S1). Numbers at bars: number of animals. ∗Significant difference from zero. Error bars are SEM.
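The figure legends report each group's score as a t statistic with its degrees of freedom (e.g., t31 for 32 animals), tested against zero, with SEM error bars. A minimal sketch of such a one-sample test is given below, assuming one PI per fly. The function name `one_sample_t` and the example values are hypothetical and are not data from this study. The p value is deliberately omitted because it requires the t distribution, which is not part of the Python standard library.

```python
import math
import statistics


def one_sample_t(scores, mu0=0.0):
    """Return (mean, sem, t, df) for a one-sample t-test of scores against mu0.

    Hypothetical helper for illustration only; not the author's analysis code.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = statistics.mean(scores)
    # SEM = sample standard deviation (n - 1 denominator) / sqrt(n)
    sem = statistics.stdev(scores) / math.sqrt(n)
    if sem == 0:
        raise ValueError("scores have zero variance; t is undefined")
    t = (mean - mu0) / sem
    df = n - 1  # degrees of freedom, the subscript in "t31"
    return mean, sem, t, df


# Made-up performance indexes, one per fly (assumption, not measured data)
example_pis = [0.5, 0.25, 0.0, 0.75, 0.5]
mean, sem, t, df = one_sample_t(example_pis)
print(f"mean PI = {mean:.2f}, SEM = {sem:.2f}, t{df} = {t:.2f}")
```

Under these assumptions, a group's mean PI differs significantly from zero when the absolute value of t exceeds the critical value of the t distribution for the given degrees of freedom.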
In order to elucidate the neuronal substrates mediating these processes, specific neuronal ensembles in the fly's brain were silenced. Because previous evidence pointed toward the MBs being involved in specific generalization processes, this neuropil was targeted with the UAS-GAL4 system to block synaptic output by expressing the bacterial tetanus toxin light chain. The first P[GAL4] driver line was MB247, because this line has already seen widespread use as an MB-specific driver line. MB247 drives expression in about 1600 of the ∼2000 Kenyon cells in all parts of the MB, except the prime lobes, and in some neurons of the central complex. The heterozygous control crosses of driver and effector strains with Canton S wild-type strains reproduced wild-type behavior (Figure 3A). Flies with impaired MB function can learn both the colors and how to modulate their turning movements. Confirming these previous results, flies with tetanus toxin expression driven by line MB247 could master the composite learning task composed of these two predictors (Figure 3B, situation 1). However, in a phenocopy of the wild-type flies after extended training, flies with such blocked MB output did not generalize the classical memory to a novel behavior and showed significant operant learning already after the regular 8 min of training (Figure 3B, situations 3 and 2, respectively). Thus, with such manipulated MB function, flies appear to form habits prematurely.
Figure 3
(A) The genetic control flies (the two heterozygote strains did not differ and were pooled) reproduce the wild-type results: significant composite learning (t26 = 3.8, p < 0.001), inhibition of the operant component (t31 = 0.7, p < 0.5), and successful generalization of the isolated classical component (t14 = 2.7, p < 0.05).
(B) Flies with blocked MB output constitute a phenocopy of the wild-type flies with extended training, already after 8 min of training. They perform well in composite learning (red: t19 = 3.1, p < 0.01), but do not inhibit the operant component during composite training (green: t18 = 2.6, p < 0.05). Without inhibition of the operant system, these transgenic flies are unable to generalize the isolated classical component to a novel behavior (blue: t20 = −0.5, p < 0.6).
(C) Specificity of the MB effects is provided by expressing TNT in the fan-shaped body. These flies behave as wild-type and control heterozygote flies with significant composite learning (t11 = 4.3, p < 0.002) and inhibition of the operant system (t16 = 0.4, p < 0.7), which in turn allows for a successful generalization of the classical component to a novel behavior (t20 = 2.7, p < 0.014).
(D) Flies with blocked output only from the α and β lobes of the MB mimic the flies expressing tetanus toxin in all MB lobes. They perform well in composite learning (t13 = 4.3, p < 0.001), do not inhibit the operant system (t13 = 3.1, p < 0.01), and do not generalize (t16 = −0.38, p < 0.71). Numbers at bars: number of animals. ∗Significant difference from zero. Error bars are SEM.
Flies in which the P[GAL4] line c205 drives expression of a constitutively active G-Protein are defective in visual pattern discrimination learning. Constitutive expression of tetanus toxin in the F5 neurons in the fan-shaped body of the central complex via the line c205 confirmed that the effects of tetanus toxin expression were specific to the MB: in contrast to the MB247 flies, these flies behaved similarly to wild-type and genetic control flies (Figure 3C). In order to investigate which of the MB lobes are responsible for the inhibition of operant learning in such composite situations, transgenic flies with the P[GAL4] driver line 17D, which drives toxin expression mainly in the MB α and β lobes (core and surface) but not in the γ lobes, were subjected to the same procedure. These flies show the same pattern of PIs as the MB247 flies: significant PIs in the control and in the purely operant test and no significant score in the generalization test (Figure 3D), conclusively tying the inhibition of the operant component to MB neurons. Moreover, we can tentatively conclude that the MB γ lobes are probably not involved in this process.
Spontaneous behavior has clear fitness benefits. However, spontaneous behavioral variation may reduce efficiency by introducing mistakes. The success of an animal thus depends on finding the right balance between efficient exploitation of known resources through routine behavior and flexible exploration of possible new resources through novel behaviors (the exploitation-exploration dilemma). In a new situation, such as the operant paradigm used here, the animal explores the environment via spontaneous behaviors. It learns about the stimuli in this environment and how they relate to each other primarily by engaging the classical learning system. During this phase, the classical learning system inhibits the operant system via the MB, preventing direct modification of the behavior of the animal and keeping the memory flexible (Figure 2 and Figure 3). After extended periods of time in this situation, the MB-mediated inhibition is overcome and the behavior is modified by the operant learning system, which may improve efficiency but also leads to inflexibility (Figure 2B). The current data allow the establishment of a mechanistic model of how operant and classical learning systems may interact in composite learning situations and which biological substrates mediate these processes (Figure 4). In this view, the Rutabaga adenylyl cyclase-dependent classical learning system inhibits the protein kinase C-dependent operant learning system via the MB. The operant learning system facilitates classical learning via still unknown, non-MB pathways (data not shown). This interaction leads to efficient learning, enables generalization, and prevents premature habit formation. In flies, it is not yet known whether the two learning systems are also separable anatomically. It is tempting to speculate that the interactions between the two learning systems are part of the mechanism achieving the balance between exploration and exploitation.
In this hypothesis, the MBs provide the checks and balances to ensure that habits are formed only if their efficiency outweighs their disadvantage of being inflexible.
Figure 4
In learning situations where the animal has the possibility to simultaneously learn about relationships between stimuli in the world and about the consequences of its own behavior, two learning systems can be engaged. One learning system learns about the world (classical learning system), and the other system learns to modify behavior (operant learning system). The AC-dependent classical learning system inhibits PKC-dependent operant learning via the mushroom bodies (MBs). Operant behavior controlling predictive stimuli facilitates learning about these stimuli by the classical learning system via unknown, non-mushroom-body pathways. These interactions lead to efficient learning, enable generalization, and prevent premature habit formation. AC, adenylyl cyclase; PKC, protein kinase C.
Such an MB function would be distinct from the one that the MBs are known to serve in olfactory classical conditioning. The current consensus is that the memory trace formed during this kind of learning lies within the MB Kenyon cells. This is clearly not the case for visual learning, where the MBs are not essential. Instead, specific features of the conditioned stimulus in visual learning appear to reside in distinct layers of the fan-shaped body of the central complex. For visual learning, the MBs appear to keep classical memories flexible for use when the fly's situation changes. If the fly's sensory situation changes, this feature supports context generalization and protects against sensory conflict. If the fly's behavioral situation changes, this feature supports the form of generalization described here. From these accumulating recent results, it appears that the inhibitory function of the MB may be much more pervasive than previously thought. It is a tantalizing finding for all Drosophila learning and memory research that overtrained wild-type flies behave indistinguishably from flies with blocked MB output: whenever the neural substrate of a learning task is studied, the question of whether the training regime constitutes overtraining must now also be considered. This is reminiscent of vertebrate experiments, where the dorsal striatum and the hippocampus are viewed as competing learning systems with the dorsal striatum involved in skill-learning and the hippocampus in fact-learning. Short training is primarily processed by the hippocampus, whereas prolonged training recruits the dorsal striatum. Interestingly, if the prelimbic medial prefrontal cortex is lesioned in rats, even short training leads to habit formation, reminiscent of the flies with blocked MB output. To my knowledge, habit formation has never been shown in any invertebrate model system before.
This discovery entails that models for addiction and other compulsive disorders can now also be developed in the fly.
With the tools developed for localizing memory traces combined with the experimental separation of operant and classical learning components, Drosophila research has now entered the stage where we can start to unravel not only where memories are stored but also how and where basic neural subsystems interact to accomplish efficient learning in more ethologically relevant situations, without compromising generalization or prematurely engaging habit formation. Research on Drosophila has provided key insights into mechanisms of classical learning that are evolutionarily conserved. The utility of this model system has now been extended to the study of complex learning situations comprising multiple, interacting learning systems on the behavioral, circuit, and genetic level. These studies expand a growing body of literature showing that simultaneously engaged memory systems can act both cooperatively and antagonistically.
The author is grateful to J. Colomb, M. Heisenberg, R. Wolf, and R. Menzel for critical discussions and comments on earlier versions of the manuscript. Fly strains were generously provided by M. Heisenberg, H. Tanimoto, and S. Waddell. T. Franke created the 3D renderings of the experimental setup with PovRay. The author is especially indebted to J. Pflüger for financial support and for providing laboratory space, advice, and encouragement in times of great need.
1 B. Brembs and W. Plendl, Double dissociation of PKC and AC manipulations on operant and classical learning in Drosophila. Curr. Biol., 18 (2008), pp. 1168–1171.
2 M.J. Frank, Slave to the striatal habit (Commentary on Tricomi et al.). Eur. J. Neurosci., 29 (2009), pp. 2223–2224.
3 B.W. Balleine, Incentive processes in instrumental conditioning, R.M.S. Klein, Editor, Handbook of Contemporary Learning Theories, LEA, Hillsdale, NJ (2001), pp. 307–366.
4 H.H. Yin and B.J. Knowlton, The role of the basal ganglia in habit formation. Nat. Rev. Neurosci., 7 (2006), pp. 464–476.
5 M.R. Hilario and R.M. Costa, High on habits. Front. Neurosci., 2 (2008), pp. 208–217.
6 A. Maye, C.-h. Hsieh, G. Sugihara and B. Brembs, Order in spontaneous behavior. PLoS ONE, 2 (2007), p. e443.
7 R. Wolf and M. Heisenberg, Basic organization of operant behavior as revealed in Drosophila flight orientation. J. Comp. Physiol. A., 169 (1991), pp. 699–705.
8 W.S. Hunter, Habit interference in the white rat and in the human subject. J. Comp. Psychol., 2 (1922), pp. 29–59.
9 J.W. Krakauer, P. Mazzoni, A. Ghazizadeh, R. Ravindran and R. Shadmehr, Generalization of motor learning depends on the history of prior action. PLoS Biol., 4 (2006), p. e316.
10 B. Brembs and M. Heisenberg, The operant and the classical in conditioned orientation in Drosophila melanogaster at the flight simulator. Learn. Mem., 7 (2000), pp. 104–115.
11 L. Liu, R. Wolf, R. Ernst and M. Heisenberg, Context generalization in Drosophila visual learning requires the mushroom bodies. Nature, 400 (1999), pp. 753–756.
12 B. Brembs and J. Wiener, Context generalization and occasion setting in Drosophila visual learning. Learn. Mem., 13 (2006), pp. 618–628.
13 S. Tang and A. Guo, Choice behavior of Drosophila facing contradictory visual cues. Science, 294 (2001), pp. 1543–1547.
14 K. Zhang, J.Z. Guo, Y. Peng, W. Xi and A. Guo, Dopamine-mushroom body circuit regulates saliency-based decision-making in Drosophila. Science, 316 (2007), pp. 1901–1904.
15 S.T. Sweeney, K. Broadie, J. Keane, H. Niemann and C.J. O'Kane, Targeted expression of tetanus toxin light chain in Drosophila specifically eliminates synaptic transmission and causes behavioral defects. Neuron, 14 (1995), pp. 341–351.
16 Y. Aso, K. Grübel, S. Busch, A.B. Friedrich, I. Siwanowicz and H. Tanimoto, The mushroom body of adult Drosophila characterized by GAL4 drivers. J. Neurogenet., 23 (2009), pp. 156–172.
17 T. Zars, M. Fischer, R. Schulz and M. Heisenberg, Localization of a short-term memory in Drosophila. Science, 288 (2000), pp. 672–675.
18 R. Wolf, T. Wittig, L. Liu, G. Wustmann, D. Eyding and M. Heisenberg, Drosophila mushroom bodies are dispensable for visual, tactile and motor learning. Learn. Mem., 5 (1998), pp. 166–178.
19 G. Liu, H. Seiler, A. Wen, T. Zars, K. Ito, R. Wolf, M. Heisenberg and L. Liu, Distinct memory traces for two visual features in the Drosophila brain. Nature, 439 (2006), pp. 551–556.
20 N.D. Daw, J.P. O'Doherty, P. Dayan, B. Seymour and R.J. Dolan, Cortical substrates for exploratory decisions in humans. Nature, 441 (2006), pp. 876–879.
21 J.D. Cohen, S.M. McClure and A.J. Yu, Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Philos. Trans. R. Soc. Lond. B Biol. Sci., 362 (2007), pp. 933–942.
22 B. Gerber, H. Tanimoto and M. Heisenberg, An engram found? Evaluating the evidence from fruit flies. Curr. Opin. Neurobiol., 14 (2004), pp. 737–744.
23 R.L. Davis, Olfactory memory formation in Drosophila: From molecular to systems neuroscience. Annu. Rev. Neurosci., 28 (2005), pp. 275–302.
24 D.B. Akalal, C.F. Wilson, L. Zong, N.K. Tanaka, K. Ito and R.L. Davis, Roles for Drosophila mushroom body neurons in olfactory learning and memory. Learn. Mem., 13 (2006), pp. 659–668.
25 A.S. Lee, R.S. Duman and C. Pittenger, A double dissociation revealing bidirectional competition between striatum and hippocampus during learning. Proc. Natl. Acad. Sci. USA, 105 (2008), pp. 17163–17168.
26 S. Killcross and E. Coutureau, Coordination of actions and habits in the medial prefrontal cortex of rats. Cereb. Cortex, 13 (2003), pp. 400–408.
27 B. Brembs, Operant learning of Drosophila at the torque meter. J. Vis. Exp., (2008).