Bob Kentridge 1995

Comparative Psychology: Lecture 8.

How general are learning processes?

Over the past seven lectures we have looked at the two main types of animal learning - classical and operant conditioning. After comparing these two types of learning I want to return to a more overtly comparative theme and look at the limits to these types of learning in different species. I will then look at more complex learning tasks and ask how species differ in their performance across these tasks. I then want to try and integrate these results, putting learning into a more general evolutionary framework and finally addressing the question of how we should view our learning abilities in this light.

Operant and Classical Conditioning.

Before looking at differences between the learning abilities of species it is as well to quickly consider the similarities and differences between operant and classical conditioning. The most obvious difference between them is the nature of the response - in classical conditioning it must be a reflex whereas in instrumental learning it is a previously neutral piece of behaviour. In addition, in operant conditioning there is a contingency between response and reinforcement while in classical conditioning no such contingency applies - whether an animal makes a CR has no effect on the presentation of the US. Are these differences artefacts of the different ways that learning has been studied in the laboratory? Certainly when we considered the operant learning procedure in detail there appeared to be classical aspects to it - the animal almost certainly makes various associations to the US of food, probably including the whole setting of the experiment, the sound of the pellet dispenser and perhaps the discriminative stimulus. In classical conditioning it is often possible to view the CR as an operant - salivation increases the palatability of food so one might view salivation as an operant and the toll of the bell as a discriminative stimulus. Similarly, heart-rate reduction might decrease the unpleasant effect of shock and so be regarded as an operant. This leads on to the question of whether the response in an overtly operant learning paradigm must be neutral - is it possible to operantly condition reflexes? Experiments on operant conditioning of heart-rate elevation in rats indicate that it is. Normally one would regard changes in heart-rate as being reflexive - we do not appear to have voluntary control over them. We might, however, elevate our heart-rate indirectly by running or making other voluntary movements. 
This confounding factor was eliminated in a classic experiment where rats were operantly conditioned with a contingency between a transient elevation of heart-rate and direct electrical brain-stimulation reward (which is about the most powerful reinforcer available) while paralysed with curare (another reason for using brain-stimulation, as it requires no active consummatory behaviour). It is also possible to find examples where a non-reflexive response appears to be classically conditioned. Often the operant required from pigeons in operant conditioning experiments is a peck on a lighted key - during continuous reinforcement training each peck is reinforced with delivery of a seed of some grain. Once the bird has been trained to eat the grain from its dispenser then if the key is lit and grain is dispensed without any pecking contingency the pigeon will soon begin to peck at the key. If the pecks were directed at the food then this would be a simple reflex. If pecks aimed at the food hopper were elicited by the light then we might characterise the behaviour as straightforward classical conditioning, but pecks directed at the light do not seem reflexive yet they are sensitive to the contingency and predictability factors that determine the effectiveness of classical and not operant conditioning. Rather than operant and Pavlovian conditioning being distinct categories of learning it begins to appear that all learning situations contain both operant and Pavlovian qualities in different degrees. A great deal of research has been carried out investigating how these facets interact.

Constraints on learning.

Both instrumental learning and classical conditioning have been presented up to now as quite general models of learning. It should be possible to condition any neutral behaviour using any reinforcer and an operant contingency. It should be possible to associate any neutral stimulus with a reflexive response in a classical conditioning procedure. Although Skinner and Pavlov both believed that they were dealing with general theories of learning with the properties I have just described, neither of the two previous statements is true. The most dramatic example of the lack of generality of learning is conditioned taste aversion (CTA). This is an example of classical conditioning: the UR is the set of behaviours accompanying gastro-intestinal distress, induced by a gastric irritant as a US - injections of the simple salt lithium chloride (LiCl) are most often used. The CS is some quality of a novel food. If one pairs a novel tasting food with LiCl then part of the CR for a rat is a subsequent refusal to consume the novel tasting food. In principle it should be just as easy to condition the rat to avoid novel foods on the basis of a visual, or any other, CS characteristic. However, this is not the case. Rats fail to learn CTA associations to foods with novel visual characteristics. In contrast, birds can be conditioned to avoid drinking water coloured in novel ways, or even paired with distinctive noise (bright-noisy-water) but fail to learn CTA to novel flavours or odours. The sensitivity of rats to associating odour cues with the outcomes of novel tasting foods is such that if a rat tastes a novel food in the presence of another rat which is made sick (but which has not eaten the novel food) the first rat will subsequently avoid the novel food. Similar effects occur in operant conditioning. Maybe the most celebrated examples concern the problems of training animals for roles in TV commercials. 
The classic case involved training a racoon to put a penny (a 1 cent coin) into a piggy bank. The racoon could initially be trained easily using food reward but soon, instead of putting the penny straight into the piggy bank as trained, it spent increasing amounts of time sniffing it, chewing it and directing other typical racoon food-related behaviours towards it. The same sort of reinforcer-specific behaviours have also been observed with aversive conditioning - animals have species-typical responses to threat and these can soon begin to dominate the operant they have been trained to produce.

Adaptation of Learning as a General Process?

Examples like these are very hard for theories of learning which purport to have general applicability to deal with. If we take a wider evolutionary viewpoint they become much easier to understand. Learning is just one way for an individual to adapt to its environment. Some aspects of the environment vary so slowly that natural selection will result in adaptations of a species (phylogenetic adaptation) allowing individuals to exploit or cope with these slowly varying qualities - developing a metabolism which can cope with particular diets or temperatures, for example. It is also possible to adapt to aspects of the environment which are a little more variable during an individual's development (epigenesis) - developing different amounts of body fat depending on diet and environment during the early stages of life is an example. One can even view aspects of sensori-motor coordination - coordinating visual signals and motor commands in grasping a fruit for example - as a form of adaptation which takes place on a very short time-scale. Somewhere between sensori-motor coordination and epigenesis lies learning. Once learning is seen in a general evolutionary framework it becomes clear that predictability of the types of learning tasks which will often be encountered may drive phylogenetic adaptation of learning preferences. In the long term it may be a consistent feature of the environment-organism pair that taste receptor responses to the composition of food are good predictors of its nutritional outcome. But, on the same long time scale it is not possible to consistently pair particular tastes with particular outcomes. An adaptation can still, however, be made to preferentially learn about taste-nutrition relationships.

'Higher' learning.

As we near the end of the comparative psychology course it is natural to worry about the status of humankind (PC alert) in all this. A naive person might assume that this has nothing to do with how people learn or what drives their behaviour. Even if it is true that we are quite different from animals we cannot, of course, be sure of this without some evidence (unless we side with Descartes (who at least understood the importance of evidence) or, perhaps more fairly to Descartes, more conservative members of 17th century society). Let us consider species differences in some more complex learning tasks.

Complex Learning Tasks.

I have mentioned some tasks which might be thought to tax 'higher' faculties than straightforward instrumental or classical conditioning already. One is the problem of matching response allocation between two or more differently valued alternatives, another is sensitivity to changes in the value of a reinforcer - having expectations. Other commonly used tasks include the ability to learn that the same task may change in its outcome while retaining the same general form - for example imagine a task in which an animal must respond on one of a pair of levers during hour-long daily sessions, but the reinforced lever changes randomly between days - if an animal learns this 'meta' task then the speed with which it learns to respond to the correct lever each day will increase. As we have seen it is possible to classically condition nearly anything. Similarly, all vertebrates seem sensitive to well chosen operant contingencies (it is no use trying to condition a rat using coloured discriminative stimuli as they are colour blind). It had been believed, however, that species differences between vertebrates could be detected in these complex tasks. A number of investigators have, however, found effective probability (matching), reward-shift and reversal learning in fish, amphibia and reptiles provided the stimuli, reinforcers and task setting are tailored to suit the species under investigation. Provided the task is set up so it is akin to one the species may need to learn in its environment then it seems that all vertebrates can learn these more complex tasks. Of course, it is reasonable to argue that these tasks are trivial for humans and that all of our learning is mediated by reflective thought - by intentional explicit problem solving. If this were true what prediction might we make about human learning capabilities?

Human learning, thought and language.

It is clear that, souls apart (for the seventeenth century conservatives amongst us), what distinguishes humans from animals is language. This is not a course on language or cognitive psychology, and I leave it up to those courses to teach you about the nature of language. Although there have been many long complex studies aimed at discerning language-learning abilities in non-humans the evidence for this is weak. Language is not the same as communication, nor is it about the association of arbitrary symbols with things and events in the real world - rats can learn to use tools we choose to give them to fulfil these functions quite easily. Language is about learning to use relationships between symbols to denote specific consequences. In our language we use complex structures in which the relationships between pairs of words in a sentence separated by many intervening phrases can nevertheless determine the meaning of a sentence. The rules which implicitly govern the interpretation of structures like this are quite different from the 'if-then' contingencies governing operant conditioning or even the more complex animal learning tasks I described earlier. Does this make animal learning theory irrelevant to the understanding of human behaviour? It might if we thought all of our actions out carefully, but we don't. It might if we had the ability to use language from birth, but we don't. Given this, however, we still need to discover whether the principles of animal learning apply to those aspects of human behaviour which are not linguistically mediated.

Human conditioning.

Little Albert is often cited as an early example of classical conditioning in humans. In fact the iron bar was struck as Albert moved to touch the rat so it may be an example of operant punishment with the rat as a discriminative stimulus. There have, however, been numerous better controlled studies of classical conditioning in humans. In particular it has been shown that, given similar species-specific constraints to those I discussed earlier, new-born humans classically condition well. A number of techniques used to treat people with phobias are closely based on classical conditioning principles - the fact that these procedures are effective in modifying patients' behaviour indicates that these phobias may themselves have been the result of earlier fortuitous classical conditioning. Babies are also very good at instrumental learning. It is, perhaps, less easy to show operant than classical conditioning in adults. This is not because adults are insensitive to contingencies of reinforcement or do not learn about them but because they often find it easy to identify them explicitly and, for our purposes, if they can articulate the contingency they might not be doing the same type of learning that animals do. In a well devised experiment, however, it is easy to demonstrate covert conditioning in adults. A typical apocryphal example is the class that was quiet and well behaved when their lecturer stood on the left side of the theatre but noisy when he stood to the right. They shaped his behaviour by moving the position at which they switched their behaviour gradually towards the door until their lectures were delivered from the corridor outside. 
In a quite non-apocryphal experiment subjects were successfully conditioned to blink their eyes on an FR 8 schedule while they actually believed they were working on a key-pressing task - there was no contingency between key-pressing and reward and their key-pressing rate did not increase during the experimental period while their eye-blink rate did, yet they had no idea that their eye-blinks were determining the delivery of reinforcement. It is possible to demonstrate all the features of animals' responses to different schedules of reinforcement provided suitable precautions are taken to avoid verbal mediation of tasks - humans produce FI scallops provided they cannot, or have no reason to, count the interval in their heads. Operant principles have been and still are used to treat behavioural disorders and have been used in various educational schemes. It is not, however, these overt applications of conditioning which are so important; it is more that in the course of our lives we are subject to so many contingencies and correlations we are not consciously aware of, which we are quite capable of learning without ever knowing. We may think that our behaviour is determined quite differently from that of animals, but, from introspection, who are we to know?