Prevalence of a sensory over a behavioral predictor in Drosophila yaw torque learning

Björn Brembs, PhD

Abstract

The weights and relationship of operant and classical components in associative learning are investigated. Tethered flying Drosophila at the torque meter modulate their yaw torque over a wide range, attempting to turn right and left. A behavior (B, the domain of yaw torque corresponding to left or right turns) and a stimulus (CS, one of two colors) are arranged to coincide with reinforcement (US, heat). This composite B+CS+US training is more effective than pure B+US training in which only the yaw torque domain is coupled to heat without a color cue. The individual contributions of the CS-US (classical) and the B-US (operant) associations in the composite training are assessed by disjoining the components after training and testing them separately. Only the classical association, but not the operant one, is detectable, excluding the possibility that operant and classical associations are equivalent. However, rearranging the contingencies between colors and yaw torque in the test reveals that the behavior has entered into the association with the US during the composite training but is accessible only as a B-CS compound in the subsequent memory test.

Introduction

Comparing operant (Skinner, 1938) and classical (Pavlov, 1927) conditioning has intrigued behavioral scientists throughout the 20th century (e.g. Skinner, 1935; Konorski and Miller, 1937b; Konorski and Miller, 1937a; Skinner, 1937; Gormezano and Tait, 1976; Donahoe et al., 1993; Rescorla, 1994; Donahoe, 1997; Donahoe et al., 1997; Brembs and Heisenberg, 2000). However, our knowledge about the relationship between the two is still rather poor. This is largely because the animal usually learns operantly and classically at the same time. Most learning situations comprise three term contingencies (Skinner, 1938) of one or more initially neutral stimuli (conditioned stimuli, CSs), the animal’s behavior (B) and the reinforcer (unconditioned stimulus, US). We believe that the experiments described here are the first not to suffer from this formerly ‘inevitable’ conglomeration.

In this study, tethered Drosophila suspended at a torque meter is used to compare the associations formed in a purely operant task (yaw torque learning; Wolf and Heisenberg, 1991) and a more complex three term contingency (switch mode learning; Brembs and Heisenberg, 2000). The two designs are very similar: in yaw torque learning the animals learn to restrict their spontaneous yaw torque range (B) in order to avoid being heated (US). No sensory stimuli are contingent upon reinforcement. In switch mode learning, yaw torque (B) still controls the US, but in addition the coloration of the fly’s visual surround is switched between green and blue (CS) as the heat is switched on or off. Thus, while in yaw torque learning only a B-US association enables the fly to avoid the heat, in switch mode with all three components present (B, CS and US) the fly may form B-US as well as CS-US associations.

It is important to note that, unlike in previous studies (e.g. Williams, 1975; Pearce and Hall, 1978; Williams, 1978; St. Claire-Smith, 1979; Williams et al., 1990; Hammerl, 1993), in switch mode learning the CS and the B have the same immediate relation to the reinforcer. Both 'predict' the reinforcer equally well. Therefore, assessing the associative strength of each component individually after conditioning both simultaneously as a compound will tell us the relative contributions of operant (B-US) and classical (CS-US) associations in the complex learning situation. This is accomplished by first training the B+CS compound in switch mode and then testing the elements B and CS separately. The strength of the B-US association is assessed by omitting the color (CS) in the test (as in yaw torque learning). Without the CS, there are no longer any sensory cues available that could elicit the correct yaw torque modulation. Any significant learning must then be attributed to the operant B-US component. Correspondingly, the classical CS-US association is assessed by preventing the B-US component from contributing to the test. This is achieved by changing the behavioral paradigm between training and test. The fly no longer controls arena coloration directly by its yaw torque domains, but by its flight direction (flight simulator; Wolf and Heisenberg, 1997). In the flight simulator the fly is surrounded by a rotating drum with visual landmarks. The angular velocity of this artificial panorama is proportional to, and directed against, the fly’s yaw torque. This enables the fly to choose its direction of flight and with it the coloration of its visual surround. A correct choice of colors with this new behavior must be due to a behavior-independent, i.e. classical, CS-US component. Comparing the learning scores of these two tests with the switch mode control situation will reveal the contributions of the single associations to the composite learning task.

Materials and Methods

Flies. Flies are kept on standard cornmeal/molasses medium (for the recipe see Guo et al., 1996) at 25°C and 60% humidity with a 16h light/8h dark regime. 24-48h old female flies are briefly immobilized by cold-anesthesia and glued (Loctite UV glass glue) with head and thorax to a triangle-shaped copper hook (diameter 0.05mm) the day before the experiment. The animals are then kept individually overnight in small moist chambers containing a few grains of sucrose.

Apparatus. The core device of the set-up is the torque compensator (torque meter). Originally devised by Götz (1964), it measures a fly's angular momentum around its vertical body axis, caused by intended flight maneuvers. The fly, glued to the hook as described above, is attached to the torque meter via a clamp to accomplish stationary flight in the center of a cylindrical panorama (arena, diameter 58mm), which is homogeneously illuminated from behind (see figure). The light source is a 100W, 12V tungsten-iodine bulb. For green and blue illumination of the arena, the light is passed through monochromatic broad band Kodak Wratten gelatin filters (#47 and #99, respectively). Filters can be exchanged by a fast solenoid within 0.1 sec. An analogue to digital converter card (PCL812; Advantech Co.) feeds the yaw torque signal into a computer which stores the signal (sampling frequency 20Hz) for later analysis. The reinforcer is a light beam (diameter 4mm at the position of the fly) generated by a 6V, 15W Zeiss microscope lamp, filtered by an infrared filter (Schott RG780, 3mm thick) and focused from above on the fly. Heat at the position of the fly is applied using a computer-controlled shutter intercepting the beam (Fig. 1).

Yaw torque learning. The fly’s spontaneous yaw torque range is divided into a ‘left’ and a ‘right’ domain (approximately corresponding to either left or right turns; for a justification of this assumption see Heisenberg and Wolf, 1993). During training, heat is applied whenever the fly's yaw torque is in one domain and switched off by the shutter when the torque passes into the other (henceforth: yaw torque sign inversion). There are no patterns on the arena wall, but the illumination is spectrally restricted by a blue-green filter (Schott BG18, glass, 3mm), as used by Liu et al. (1999) to allow for context generalization. In the yaw torque learning test phases, heat is permanently switched off and the fly’s choice of yaw torque domains is recorded (see Wolf and Heisenberg, 1991).

Switch mode learning. This is an extension of yaw torque learning. As in yaw torque learning, the fly is heated whenever its yaw torque passes into the domain associated with reinforcement. During yaw torque sign inversion not only temperature but also arena coloration is changed (from green to blue or vice versa). In the switch mode test phases, heat is permanently switched off and only the fly’s choice of yaw torque domains is recorded, as arena illumination is always switched upon yaw torque sign inversion (see Brembs and Heisenberg, 2000).
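The two reinforcement rules described above can be summarized in a minimal sketch. It is an illustration only, not the original control software: the names `ArenaState`, `torque_domain` and `update_arena`, the sign convention (negative torque meaning 'left') and the default heated color are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ArenaState:
    heat_on: bool
    color: str  # "green", "blue", or "BG18" for the constant blue-green filter


def torque_domain(yaw_torque, zero_point=0.0):
    """Classify a torque sample as 'left' or 'right'.

    Assumption: values below zero_point count as 'left'.
    """
    return "left" if yaw_torque < zero_point else "right"


def update_arena(yaw_torque, punished_domain, mode, training, heated_color="blue"):
    """Return heat and color for one torque sample.

    mode "yaw_torque": heat depends on the torque domain, color is constant.
    mode "switch":     heat and color both change on yaw torque sign inversion.
    In test phases (training=False) heat stays off in both modes.
    """
    in_punished = torque_domain(yaw_torque) == punished_domain
    heat_on = training and in_punished
    if mode == "yaw_torque":
        color = "BG18"
    elif mode == "switch":
        other_color = "green" if heated_color == "blue" else "blue"
        color = heated_color if in_punished else other_color
    else:
        raise ValueError(f"unknown mode: {mode}")
    return ArenaState(heat_on, color)
```

In this sketch the only difference between the two paradigms is whether the color follows the torque domain, which mirrors the statement that switch mode learning is an extension of yaw torque learning.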

Flight simulator. In order to give the animal the choice of angular orientations, the arena is patterned with 20 evenly spaced stripes (pattern wavelength λ=18°). Via the motor control unit, an electric motor rotates the arena so that its angular velocity is proportional to, but directed against, the fly’s yaw torque (coupling coefficient K=-11°/s per 10⁻¹⁰Nm of yaw torque). This enables the fly to stabilize the rotational movements of the panorama and to control its angular orientation. The angular position of an arbitrarily chosen point of reference on the arena wall delineates a relative 'flight direction' of 0-360°. Flight direction (arena position) is recorded continuously via a circular potentiometer (Novotechnik, A4102a306, 10kΩ, Germany) and stored in the computer memory together with yaw torque (sampling frequency 20Hz). A computer program divides the 360° of the arena into four virtual 90° quadrants. The color of the illumination of the whole arena is changed whenever one of the virtual quadrant borders passes the frontal midline (i.e. flight direction) of the fly. During test phases, the fly’s choice of arena orientations that lead to green and blue illumination is recorded without reinforcement. During the 60s reminder training, heat reinforcement (input voltage 6.0V) is contiguous with either green or blue illumination of the arena (see Wolf and Heisenberg, 1997; Brembs and Heisenberg, 2000).
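The closed loop between yaw torque, arena rotation and arena color can be sketched as follows. The coupling coefficient and the sampling frequency are taken from the text; the function names, the discrete-time update and the choice of which quadrants are green are assumptions for illustration only.

```python
K_DEG_PER_S = -11.0  # coupling coefficient: deg/s per 1e-10 Nm of yaw torque
SAMPLE_HZ = 20.0     # sampling frequency stated in the text


def step_arena(position_deg, yaw_torque_units, dt=1.0 / SAMPLE_HZ):
    """Advance the arena position by one sample.

    yaw_torque_units is the yaw torque expressed in units of 1e-10 Nm.
    The arena turns against the torque because K is negative.
    """
    angular_velocity = K_DEG_PER_S * yaw_torque_units
    return (position_deg + angular_velocity * dt) % 360.0


def quadrant_color(position_deg, first_color="green"):
    """Return the arena color for a flight direction.

    The 360 degrees are split into four 90 degree quadrants and the color
    changes at every quadrant border, so opposite quadrants share a color.
    Assumption: the quadrant starting at 0 degrees shows first_color.
    """
    quadrant = int(position_deg % 360.0 // 90.0)
    other_color = "blue" if first_color == "green" else "green"
    return first_color if quadrant % 2 == 0 else other_color
```

Calling `step_arena` once per torque sample and then `quadrant_color` on the new position reproduces the rule that the whole arena changes color whenever a quadrant border crosses the fly's frontal midline.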

Experimental Design. After 8 minutes of training in switch mode, four different 2 minute tests are performed with the heat permanently switched off (Table 1): (1) No change is made after training, and the animals continue to control the arena coloration by restricting their yaw torque (Fig. 1d; control). (2) The CS-US association alone is tested with a different operant behavior: flight simulator (Fig. 1e+g; colors alone; data from Brembs and Heisenberg, 2000). (3) The B-US association alone is tested by replacing the color filters with the blue-green Schott BG18 filter (i.e. a yaw torque learning test). There are no external cues contingent upon yaw torque sign inversion any more (Fig. 1f+h, torque alone). (4) The contingency between yaw torque range (B) and arena coloration (CS) is reversed, i.e. if during training the ‘left’ yaw torque domain leads to blue arena coloration, it leads to green coloration in the test and vice versa (Fig. 1i, exchanged). This procedure combines the two previous tests (2) and (3) in a within-animal design. In that way, the contribution of each of the two associations is measured while the other is present but in a (compared to the training) conflicting contingency (see Table 1). If one of the two associations is directly dominant over the other, this test should yield a learning score significantly different from zero.
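The reversed contingency of test (4) can be illustrated with a short sketch. The mapping `training_map`, the helper names and the example times are hypothetical; the sign convention follows the figure legend, where positive scores indicate correct choice of colors and negative scores correct yaw torque modulation.

```python
OTHER_DOMAIN = {"left": "right", "right": "left"}


def color_for_domain(domain, training_map, exchanged=False):
    """Return the arena color a yaw torque domain produces.

    training_map gives the domain-to-color coupling used during training.
    With exchanged=True each domain produces the color that the other
    domain produced during training.
    """
    source = OTHER_DOMAIN[domain] if exchanged else domain
    return training_map[source]


def exchanged_test_pi(t_safe_color, t_heated_color):
    """Score the exchanged test from the point of view of the colors.

    In this test, time in the formerly unpunished color is also time in the
    formerly punished torque domain, so a positive value means the fly
    followed the colors and a negative value means it followed torque.
    """
    total = t_safe_color + t_heated_color
    if total == 0:
        raise ValueError("no time recorded")
    return (t_safe_color - t_heated_color) / total
```

For example, with a hypothetical `training_map = {"left": "blue", "right": "green"}`, the 'left' domain yields blue in the control test and green in the exchanged test.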

Two additional controls were carried out: To control for context dependent learning of the B-US association, the sequence of coloration switches of the switch mode control group is played back to a naive group (Fig. 1b). During this replay, the flies can control the appearance of the US by restricting their yaw torque (yaw torque learning with replayed colors), but the coloration of the arena bears no temporal relation to the fly’s behavior. To control for the amount of possible interference of this treatment with the yaw torque learning process, a standard yaw torque learning experiment (see Wolf and Heisenberg, 1991) is conducted (Fig. 1a).

Data evaluation. The color or yaw torque domain preference of individual flies is calculated as the performance index: PI=(ta-tb)/(ta+tb). During training, tb indicates the time the fly is exposed to the reinforcer and ta the time without reinforcement. Mean avoidance scores are calculated by averaging over all four training PIs. During tests, ta and tb refer to the times when the fly chose the formerly (or subsequently) unpunished or punished situation, respectively. Memory test PIs are called ‘test PIs’ or ‘learning scores’.
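As a worked illustration of the performance index, the following sketch computes PI=(ta-tb)/(ta+tb) from two durations or from a sequence of samples. The function names and the per-sample boolean representation are assumptions for the example and are not taken from the original analysis software.

```python
def performance_index(t_a, t_b):
    """PI = (ta - tb) / (ta + tb).

    t_a: time in the unpunished situation, t_b: time in the punished one.
    The result ranges from -1 (always punished) to +1 (never punished).
    """
    total = t_a + t_b
    if total == 0:
        raise ValueError("no time recorded")
    return (t_a - t_b) / total


def pi_from_samples(punished_flags):
    """Compute the PI from equally spaced samples (e.g. recorded at 20 Hz).

    punished_flags holds one boolean per sample: True if the fly was in the
    punished (or formerly punished) situation at that sample.
    """
    t_b = sum(1 for flag in punished_flags if flag)
    t_a = len(punished_flags) - t_b
    return performance_index(t_a, t_b)


def mean_avoidance(training_pis):
    """Average the PIs of the training periods into one avoidance score."""
    return sum(training_pis) / len(training_pis)
```

A fly that spends 90 s of a 2 minute period in the unpunished situation and 30 s in the punished one would thus score (90-30)/120 = 0.5.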

Statistics. Tests for normal distribution of performance indices yield varying results. Therefore, non-parametric tests are used; i.e. a Mann-Whitney U-test for comparing two independent samples and a Wilcoxon matched pairs test to test single performance indices against zero. Comparisons of two groups consisting of several variables are carried out using a repeated measures ANOVA.
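The rank statistics behind the two non-parametric tests can be sketched in plain Python. This is a minimal illustration that only computes the test statistics; obtaining p-values would additionally require exact tables or a statistics package (for instance SciPy's `scipy.stats.wilcoxon` and `scipy.stats.mannwhitneyu`). The helper names and the handling of zeros (dropped before ranking) are assumptions, not a description of the software used in the study.

```python
def midranks(values):
    """Rank values starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def wilcoxon_against_zero(pis):
    """Return (W_plus, W_minus) for testing PIs against zero.

    Assumption: PIs that are exactly zero are dropped before ranking.
    """
    nonzero = [x for x in pis if x != 0]
    ranks = midranks([abs(x) for x in nonzero])
    w_plus = sum(r for x, r in zip(nonzero, ranks) if x > 0)
    w_minus = sum(r for x, r in zip(nonzero, ranks) if x < 0)
    return w_plus, w_minus


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples of PIs."""
    ranks = midranks(list(sample_a) + list(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    return u_a, n_a * n_b - u_a
```

A strongly unbalanced pair of statistics (for example W_plus much larger than W_minus) corresponds to PIs that lie mostly on one side of zero.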

Results

Training Drosophila in the switch mode paradigm in which heat reinforcement (US) is contingent upon one of two colors (CS) and upon one domain of the yaw torque range (B), and then testing for the B-US and CS-US associations separately does not yield significant learning scores (CS: Fig. 1e, p=0.615; B: Fig. 1f, p=0.706; Wilcoxon matched pairs test). One could conclude that the two elements are bound together during the conditioning procedure and are not considered appropriate predictors of reinforcement separately. It may also be, however, that the drastic change in the situation (context) after the training prevents retrieval. We therefore introduce a reminder training in the new situation. During this brief (60s) reminder training phase the animal is heated either on the color (Fig. 1g) or on the torque range (Fig. 1h) that was previously associated with heat. Control experiments verified that 60s of reminder training by themselves are not sufficient to significantly condition the fly (data not shown). After the reminder in the new situation, only the CS alone test (Fig. 1g) but not the test for B alone (Fig. 1h) shows a significant PI (CS alone: p<0.005, Wilcoxon matched pairs test; B alone: p=0.141, Wilcoxon matched pairs test). This different outcome is significant because the transfer from switch mode to yaw torque learning should be even less dramatic than the transfer from switch mode to flight simulator. Moreover, the color replay experiment in Fig. 1b shows that just omitting color after the training in yaw torque learning has no deleterious effect on memory retrieval. Thus, the CS-US association can be shown independently of the behavior during which it was formed whereas no independent B-US association is observed after training in the three term contingency.

It is important to emphasize that during switch mode training the flies could ignore the CS and still do very well. Indeed the learning score in yaw torque learning (Fig. 1a) is not significantly different from the switch mode control (see below). Nevertheless, flies learn to discriminate the visual cues, which is demonstrated by the transfer to the flight simulator (Fig. 1g; data from Brembs and Heisenberg, 2000). Even more surprisingly, this learning of the visual cues blocks the display of the B-US association that would be available if the visual cues were not related to the fly’s behavior (Fig. 1b; p<0.05, Wilcoxon matched pairs test). Apparently, contingent reinforcement is not always sufficient for B-US associations to form.

The question remains whether in the composite training the B-US association is not formed at all, or is not independent of the concomitantly formed CS-US association. The critical test to decide between these alternatives is to reverse the contingencies between yaw torque range and color (Table 1; Fig. 1i). The data are presented such that positive learning scores indicate a dominance of stimuli over behavior and negative scores the opposite. If no B-US association were formed the PI should be the same as in the switch mode control group (Fig. 1d). The two values are highly significantly different (p<0.001, Mann-Whitney U-test), indicating an association between the behavior and the US, which, however, is revealed only as an interaction with the CS. The tendency for yaw torque to dominate over colors in the reversed contingency fails to reach statistical reliability (p=0.085, Wilcoxon matched pairs test).

As the color replay treatment (Fig. 1b) does not lead to significantly larger or smaller PIs than regular yaw torque learning (Fig. 1a; mean avoidance: p=0.165, learning: p=0.871; Mann-Whitney U-test), these data are pooled and compared to the switch mode control group (Fig. 1d). Although there is a tendency for switch mode to yield higher PIs than yaw torque learning, this effect fails to reach significance (p=0.121, Mann-Whitney U-test). The difference in mean avoidance, however, is significant (p<0.03, Mann-Whitney U-test). A repeated measures ANOVA over both groups and both variables suggests that switch mode is more effective than yaw torque learning because it requires less training and generates higher learning scores (p<0.04).


Fig. 1: Dissociating behavioural and sensory predictors (sw-mode: switch mode; fs-mode: flight simulator mode). a - Yaw torque learning, arena coloration BG18. N=30. b - Yaw torque learning with arena coloration recorded from the flies used in c and played back for the first 14 minutes of the experiment. The last test was performed using BG18 as constant colour filter. N=30. c - Pooled sw-mode data of all flies tested for individual associations. The final 2 minute test periods of the subgroups in this experiment are depicted in d-i. N=250. d - Sw-mode control. N=70. e - Fs-mode test for colour learning. No reminder training. N=22. f - Test for yaw torque learning. The colour filters have been replaced by a BG18 filter. No reminder training. N=73. g - Fs-mode test for colour learning. 60s of fs-mode reminder training (rt) after sw-mode training prior to testing (not shown). N=23. h - Test for torque learning. 60s of reminder training (rt) with BG18 after sw-mode training prior to testing (not shown). N=30. i - Exchanged predictors. Colours and yaw torque domain contingencies have been reversed. Reversal was such that positive scores would indicate correct choice of colours and negative scores correct yaw torque modulation. N=32. Statistics were performed as Wilcoxon matched pairs tests against zero: *** - significant at p<0.001; ** - significant at p<0.01. Hatched bars - training, open bars - test. Error bars are S.E.M.s of N flies.

Discussion

Switch mode learning, an extended version of yaw torque learning, was analyzed as a composite learning task to assess the contributions of operant and classical components. First, the composite task turned out to be more effective than the purely operant yaw torque learning, i.e. it required less reinforcement for similar learning scores. Second, dissociating the different possible associations formed during switch mode training revealed that the B-US association alone is not retrievable after composite training. Third, inversion of the B-CS relationship after training disrupted performance.

(1) It was shown previously that in pattern discrimination learning at the flight simulator purely classical pattern learning (only CS-US; Wolf et al., 1998) is less effective than the composite task (B, CS and US; Wolf and Heisenberg, 1991; Brembs and Heisenberg, 2000). The present result is a further example of a two term contingency (yaw torque learning) being less effective than the corresponding three term contingency (switch mode learning; Fig. 1). This is evolutionarily reasonable, as most natural learning situations comprise a three term contingency.

(2) Why is switch mode training so effective? The higher learning score compared to the simple task suggests that both B-US and CS-US associations are formed and add their associative strengths. This, however, is not what happens. By adding the CS to a yaw torque learning experiment, one suppresses the B-US association that would otherwise have occurred (Fig. 1f, h). This effect is reminiscent of a series of experiments (Williams, 1975; Pearce and Hall, 1978; Williams, 1978; St. Claire-Smith, 1979; Williams et al., 1990; Williams, 1999) in which rats had to press a lever several times (B) to obtain a reward (US) after a certain delay (or pigeons had to peck a keylight). Acquisition of the B-US association (bar pressing or key pecking) was reduced when each reinforcement was signaled by a stimulus (CS) during the delay period. We note that in Drosophila, as in vertebrates, contingent reinforcement of a behavior may not be sufficient for B-US associations to form. However, conclusions about the interactions of operant and classical processes are difficult to draw from the vertebrate preparation.

It is important to emphasize that in switch mode learning the CS, the B and the US are strictly coupled. If the CS is present but not in a predictive relation to the US, no suppression of the B-US association occurs (Fig. 1b). This control is important also for a different reason. Liu et al. (1999) have shown that colors are powerful context cues for the fly and certain context changes between training and test abolish memory. The experiment of Fig. 1b demonstrates that a B-US association can be observed after a color change. Hence, in the experiments of Fig. 1f, h it is not the color change which suppresses memory recall.

(3) If operant and classical processes do not add up during switch mode training, what associations are formed? Assume no B-US association were formed during switch mode learning, i.e., after training in switch mode the fly relied entirely on the color cue and had no recollection of the behavioral modification it used to escape the heat. Then one would expect a reversal of the contingencies between yaw torque range and colors in the final test to have little impact on the learning scores. The fly now would restrict its yaw torque range to the other domain in order to avoid the color previously contingent with heat. However, neither a color nor a yaw torque domain seems particularly preferred. If anything, a (non-significant) tendency to stick to the previous yaw torque domain and to disregard the colors can be observed (Fig. 1i). Why is the color memory sufficient to induce the correct yaw torque/color choice in one contingency (Fig. 1d) but not in the other (Fig. 1i)? If the above conclusion holds that no separate B-US association is formed, we have to assume that in addition to the CS-US association a three term association between color, yaw torque domain and heat is formed during switch mode learning. In other words, there appears to be an association between the behavior and the US, but it also includes the CS. Therefore the flies do not exhibit a significant color or yaw torque preference in the reversed contingency, revealing the behavioral part of the three term association.

It is still an open question how colors and yaw torque modulations interact to accomplish the facilitated learning. In the composite training the behavioral modification seems 'color-tagged' such that without the color it is not retrieved. In this respect it is of interest to note that Guo et al. (1996) have reported a B-CS interaction in pattern discrimination learning at the flight simulator. They reported that the duration of the pre-test and the performance in the final learning test are positively correlated if the coupling coefficient in the flight simulator is increased above the normal value, implying that the task of stabilizing the panorama is more difficult than usual. This observation suggests that during the pre-test the flies learn about the consequences of their yaw torque on pattern motion and orientation and that this 'skill' gives them an advantage in the subsequent heat avoidance task. Likewise, in the present study, learning about the B-CS relations has a positive impact on learning performance, as the comparison between yaw torque and switch mode learning shows.

Our new results are a significant extension and generalization of our previous findings regarding the relationship between operant and classical associations in flight simulator learning, where operant behavior was shown to facilitate classical learning but no B-US association was detectable as in the present case (Brembs and Heisenberg, 2000). In both paradigms the advantage of composite over single association tasks is not due to additive processes but to the formation of more complex (B-CS-US) associations. Once one of the predictive components in the complex is altered, it takes special treatment (reminder training) to reveal the remaining single associations. The amount of reminder training required may vary with the component. In the present study, the classical CS-US component needed little reminder training whereas for the operant B-US component the reminder training offered was not enough. For the first time since Skinner, Miller and Konorski’s initial discussion of the problem (Skinner, 1935; Konorski and Miller, 1937b; Konorski and Miller, 1937a; Skinner, 1937), solid experimental evidence against the equivalence of operant and classical processes could be gathered.

At the torque meter, classical associations are more readily formed if the CS is under operant control. On the other hand, lasting behavioral modifications, as the result of operant associations, are only observed if no adequate sensory predictors are available. It would be surprising if these rules were a specialty of Drosophila at the torque meter. Perhaps modifications of behavioral control leave less flexibility than modifications of sensory processing.

Acknowledgements

We thank R. Wolf for invaluable discussions and technical support, and S. Clemens-Richter for stock keeping. The work was supported by the Deutsche Forschungsgemeinschaft (He 986) and Fonds der Chemischen Industrie.

References:
  • Brembs, B. and Heisenberg, M. 2000. The operant and the classical in conditioned orientation in Drosophila melanogaster at the flight simulator. Learn. Mem. 7: 104-115.
  • Donahoe, J.W. 1997. Selection networks: Simulation of plasticity through reinforcement learning. In: Neural Networks models of cognition (J. W. Donahoe & V. Packard Dorsel, eds). Elsevier Science B. V., p. 336-357.
  • Donahoe, J.W., Burgos, J.E. and Palmer, D.C. 1993. A selectionist approach to reinforcement. J. Exp. Anal. Behav. 60: 17-40.
  • Donahoe, J.W., Palmer, D.C. and Burgos, J.E. 1997. The S-R issue: Its status in behavior analysis and in Donahoe and Palmer's "Learning and Complex Behavior" (with commentaries and reply). J. Exp. Anal. Behav. 67: 193-273.
  • Gormezano, I. and Tait, R.W. 1976. The Pavlovian analysis of instrumental conditioning. Pavlov. J. Biol. Sci. 11: 37-55.
  • Götz, K.G. 1964. Optomotorische Untersuchung des visuellen Systems einiger Augenmutanten der Fruchtfliege Drosophila. Kybernetik. 2: 77-92.
  • Guo, A., Liu, L., Xia, S.-Z., Feng, C.-H., Wolf, R. and Heisenberg, M. 1996. Conditioned visual flight orientation in Drosophila; Dependence on age, practice and diet. Learn. Mem. 3: 49-59.
  • Hammerl, M. 1993. Blocking observed in human instrumental conditioning. Learn. Motiv. 24: 73-87.
  • Heisenberg, M. and Wolf, R. 1993. The sensory-motor link in motion-dependent flight control of flies. Rev. Oculomot. Res. 5: 265-283.
  • Konorski, J. and Miller, S. 1937a. Further remarks on two types of conditioned reflex. J. Gen. Psychol. 17: 405-407.
  • Konorski, J. and Miller, S. 1937b. On two types of conditioned reflex. J. Gen. Psychol. 16: 264-272.
  • Liu, L., Wolf, R., Ernst, R. and Heisenberg, M. 1999. Context generalization in Drosophila visual learning requires the mushroom bodies. Nature 400: 753-756.
  • Pavlov, I.P. 1927. Conditioned reflexes. Oxford University Press, Oxford.
  • Pearce, J.M. and Hall, G. 1978. Overshadowing the instrumental conditioning of a lever press response by a more valid predictor of the reinforcer. Anim. Behav. Process. 20: 44-50.
  • Skinner, B.F. 1935. Two types of conditioned reflex and a pseudo type. J. Gen. Psychol. 12: 66-77.
  • Skinner, B.F. 1937. Two types of conditioned reflex: A reply to Konorski and Miller. J. Gen. Psychol. 16: 272-279.
  • Skinner, B.F. 1938. The behavior of organisms. Appleton, New York.
  • St. Claire-Smith, R. 1979. The overshadowing of instrumental conditioning by a stimulus that predicts reinforcement better than the response. Anim. Learn. Behav. 7: 224-228.
  • Williams, B.A. 1975. The blocking of reinforcement control. J. Exp. Anal. Behav. 24: 215-225.
  • Williams, B.A. 1978. Informational effects on the response-reinforcer association. Anim. Learn. Behav. 13: 6-12.
  • Williams, B.A. 1999. Associative competition in operant conditioning: blocking the response-reinforcer association. Psych. Bull. Rev. 6: 618-623.
  • Williams, B.A., Preston, R.A. and DeKervor, D.E. 1990. Blocking of the response-reinforcer association: additional evidence. Learn. Motiv. 21: 379-398.
  • Wolf, R. and Heisenberg, M. 1991. Basic organization of operant behavior as revealed in Drosophila flight orientation. J. Comp. Physiol. (A) 169: 699-705.
  • Wolf, R. and Heisenberg, M. 1997. Visual Space from Visual Motion: Turn Integration in Tethered Flying Drosophila. Learn. Mem. 4: 318-327.
  • Wolf, R., Wittig, T., Liu, L., Wustmann, G., Eyding, D. and Heisenberg, M. 1998. Drosophila mushroom bodies are dispensable for visual, tactile and motor learning. Learn. Mem. 5: 166-178.