What is Learned During Conditioning?

If there is a common mechanism underlying all types of learning, what associations are formed during conditioning?

In the context of operant conditioning, it remains to be determined whether introducing a discriminative stimulus into a 'purely' operant paradigm significantly alters the mode of memory acquisition. One crucial experiment for determining the degree to which a classical component is integrated into an operant paradigm would be to train an individual operantly to use one output channel to optimize its stimulus situation (reinforcer and stimulus) and then to test it by coupling a different output channel to the same environment (stimulus without reinforcer). Such an experiment has not yet been performed in Drosophila.

In the context of classical conditioning, the issue is which properties of the contiguous events become encoded and associated. One consequence of the notion that stimulus-reinforcer associations are formed during classical training has received much attention in the literature: if simple stimulus substitution were to account for learning, the CR should be identical to the UR. Several observations indicate that this is not the case. Rabbits, for instance, respond with swallowing and jaw movements during training in a salivary conditioning paradigm but fail to show these behaviors during the test (Sheffield, 1965; cited in Rescorla and Holland, 1982). Conversely, the CR might include behaviors not present in the UR: Pavlov's dog mentioned above, which showed appetitive behavior towards the bell, is one example of motor activity directed at CSs paired with food, although such activity is not part of the response to food itself. Pigeons peck visual signals for USs that do not themselves elicit pecking, such as water delivered directly into the mouth (Woodruff and Williams, 1976; cited in Rescorla and Holland, 1982) or heat (Wasserman, 1973; Wasserman et al. 1975; cited in Rescorla and Holland, 1982). Spear et al. (1990) cite Pinel et al. (1980), where conditioning is expressed as suppression if a tone predicted the US (shock) but as active hiding if the US was signaled by a prod.

As noted above (4), some flies observed during this study confirm the evidence in the literature: they produced spike volleys and a shift in the torque baseline when heated under open-loop conditions (Fig. 17), a behavior that is clearly to be classified as a UR to the heat. This behavior disappears completely when the heat is switched off (Reinhard Wolf, pers. comm.); a trace of it, however, can be detected in closed loop: the spikes produced in quadrants with the previously heated pattern are larger and closer together than in the other sectors.

Moreover, it has been shown that conditioning still occurs if the stimulus properties of the US are suppressed: stimulation of the VUMmx1 neuron in bees can substitute for the sugar reinforcer (Hammer, 1993). Suppressing the response-evoking properties of the US, for instance by applying response-attenuating drugs such as curare, does not prevent learning either (Solomon and Turner, 1962; cited in Rescorla and Holland, 1982), which rules out direct stimulus-response associations in classical conditioning.

If the view of single response-reinforcer, stimulus-reinforcer or stimulus-response associations is so oversimplified, what is happening in the brain of a conditioned organism?

Assume an animal struggling for survival: every sensation might provide a clue as to how to escape a predator, find a mate, or discover new food patches, hiding places, etc. At every moment it is confronted with potentially dangerous or advantageous situations. The ability to predict such situations must confer an enormous selective advantage. A very effective way to accomplish this task would be an evaluation mechanism that judges situations according to their 'beneficence' for the individual. With such a mechanism, salient internal and external stimulus arrays extracted from the situation would receive situation-specific rankings on a value scale in terms of 'good', 'bad' or 'neutral'. The probability of performing a given behavior in a certain stimulus situation is then the manifestation of a more or less complex superposition of stimulus rankings, motivation and initiating activity. In this picture, learning corresponds to linking neutral or unknown stimuli to already ranked ones whenever a sufficient cross-correlation between them is detected. In 'pure' operant conditioning, the internal representation of behaviors (efference copy, von Holst and Mittelstaedt, 1950) is linked to the ranking of the reinforcer if the correlation coefficient between them is sufficiently positive. If further stimuli are contingent with the reinforcer, they receive corresponding rankings as well. The richer the environment, the more complex the net between the stimuli becomes. An a priori ranking of stimuli and behavioral representations constitutes the basis for URs, fixed action patterns, and the species-specific salience and associability of certain stimuli. As mentioned above, many of these rankings are assumed to be situation-specific. Here, 'situation-specific' refers to the situation in which the ranking was acquired. For instance, if the reinforcer was food, the contiguous behavior(s) or event(s) would receive a food-specific ranking rather than a mate- or danger-specific ranking. Prolonged training intensifies the links (associations) between the stimuli. This might lead to such prominent behavior as described above for the chimpanzees.
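
The linking step of this informal model can be made concrete in a small sketch. It is an illustration under stated assumptions only, not a model fitted to any data: the function names, the binary activity traces, the linear scaling of the acquired ranking by the correlation coefficient and the threshold of 0.5 are all hypothetical choices made for the example. Each trace (an efference copy or a stimulus) is correlated with the reinforcer, and only sufficiently positive correlations lead to a link with the reinforcer's ranking.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant trace carries no predictive information
    return cov / (norm_x * norm_y)


def link_rankings(traces, reinforcer_trace, reinforcer_value, threshold=0.5):
    """Give each trace a ranking derived from the reinforcer's ranking.

    A trace is linked only if its correlation with the reinforcer is
    sufficiently positive (>= threshold, a hypothetical cut-off);
    otherwise it stays at 0.0, i.e. 'neutral'.
    """
    rankings = {}
    for name, trace in traces.items():
        r = pearson(trace, reinforcer_trace)
        rankings[name] = r * reinforcer_value if r >= threshold else 0.0
    return rankings


# Hypothetical binary traces over eight time bins (1 = active/present).
heat = [1, 1, 0, 0, 1, 0, 1, 0]  # reinforcer, ranked 'bad' (-1.0)
traces = {
    "efference_copy_left_turn": [1, 1, 0, 0, 1, 0, 1, 0],  # fully contingent
    "visual_pattern": [1, 1, 0, 0, 1, 0, 0, 0],            # mostly contingent
    "unrelated_stimulus": [0, 1, 0, 1, 0, 1, 0, 1],        # not contingent
}
rankings = link_rankings(traces, heat, reinforcer_value=-1.0)
for name, value in rankings.items():
    print(f"{name}: {value:+.2f}")
```

In this toy setting the fully contingent behavioral trace inherits the full negative ranking, the partly contingent stimulus a weaker one, and the uncorrelated stimulus remains neutral.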

Such an informal model is consistent with several attentional models of acquisition (e.g. Rescorla and Wagner, 1972; Mackintosh, 1975) that describe overshadowing (Pavlov, 1927) or blocking (Kamin, 1969), since the distance of stimuli from the value 'neutral' might confer on them the appropriate attentional properties. It fully accounts for differences between UR and CR, not only by transferring the US value to the stimulus but also by linking the rankings of different US features to the appropriate behavioral rankings (situation-specificity). Sensory preconditioning (in rats: Rescorla and Cunningham, 1978; in bees: Müller et al. 1996) is predicted by this model, as the ranks of contiguous (or, for that matter, similar) stimuli would become linked.
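
For orientation, blocking and overshadowing as described by the Rescorla and Wagner (1972) rule can be reproduced in a few lines. The sketch below uses the standard update, in which the associative strength V of every stimulus present on a trial changes by alpha * beta * (lambda - sum of V over all stimuli present). The function name, the salience values, the number of trials and the stimulus labels "A" and "B" are illustrative assumptions, not values taken from the cited studies.

```python
def rescorla_wagner(trials, alpha, beta=1.0, lam=1.0, v=None):
    """Apply the Rescorla-Wagner update over a sequence of trials.

    trials: list of (stimuli, reinforced) pairs; stimuli is a tuple of names
    alpha:  dict mapping each stimulus name to its salience
    beta:   learning-rate parameter associated with the US
    lam:    asymptote of associative strength supported by the US
    v:      optional starting strengths (e.g. from an earlier phase)
    """
    v = dict(v) if v else {}
    for stimuli, reinforced in trials:
        target = lam if reinforced else 0.0
        # The prediction error is shared by all stimuli present on the trial.
        error = target - sum(v.get(s, 0.0) for s in stimuli)
        for s in stimuli:
            v[s] = v.get(s, 0.0) + alpha[s] * beta * error
    return v


equal = {"A": 0.2, "B": 0.2}  # illustrative saliences

# Blocking: A is pretrained alone, then A and B are reinforced in compound.
pretrained = rescorla_wagner([(("A",), True)] * 50, equal)
blocked = rescorla_wagner([(("A", "B"), True)] * 50, equal, v=pretrained)

# Control: compound training without pretraining of A.
control = rescorla_wagner([(("A", "B"), True)] * 50, equal)

# Overshadowing: the more salient A gains strength at the expense of B.
overshadow = rescorla_wagner([(("A", "B"), True)] * 50, {"A": 0.4, "B": 0.1})

print("blocked B:", round(blocked["B"], 4), "control B:", round(control["B"], 4))
print("overshadowing A:", round(overshadow["A"], 4), "B:", round(overshadow["B"], 4))
```

Because the pretrained stimulus already predicts the reinforcer, almost no prediction error is left for the added stimulus, which is the sense in which an already ranked stimulus can block the ranking of a new one.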

Such a model would imply that as soon as one salient sensory stimulus is presented contiguously with the reinforcer in an operant conditioning paradigm, the subject will use all output channels to respond appropriately during the test.
