Saturday 3 February 2018

Some thoughts on negative and positive reinforcement

Most of us, if we learn about learning theory, quickly become familiar with the following:

Negative reinforcement: the removal of something aversive from the environment as a consequence of a behaviour, making that behaviour more likely to occur in the future.

Positive reinforcement: the addition of something rewarding into the environment as a consequence of a behaviour, making that behaviour more likely to occur in the future.

Negative punishment: the removal of something appealing from the environment as a consequence of a behaviour, making that behaviour less likely to occur in the future.

Positive punishment: the addition of something aversive into the environment as a consequence of a behaviour, making that behaviour less likely to occur in the future.

Looks quite straightforward...

Then, we categorise the training and learning we see. We give our horse a polo when he touches a cone, and say we are training with positive reinforcement. We form biases about what kind of training is good and what is bad. Taken to an extreme, we form opinions about whether people are good or bad based on the kind of training they do, but that's a subject for another day! 

So - a couple of thoughts on positive and negative reinforcement. 

Is it positive or negative reinforcement? 

Let's say my horse Paddy has an itchy leg - he finds a tree stump, lifts his leg and rubs against it, removing the itch. We would generally describe this as negative reinforcement - the irritating itch has been removed by Paddy's behaviour - rubbing the tree stump. 

Now, let's say Paddy has an itchy leg and lifts his leg while I am grooming him. I reach down and scratch his leg for him. Is this positive reinforcement - I am adding a pleasant scratch - or negative reinforcement - I am removing the itch, as in the previous case? Does Paddy learn any differently in the two cases above?

Since it is the horse who is learning, it only really matters how his brain processes these two events - my opinion is unimportant in his learning process! But I'd be more likely to call the latter case positive reinforcement, because I added a good consequence. 

So already we see a grey area in how we classify the learning. 

Can learning through negative reinforcement be a 'nice' experience?
The example above, where Paddy finds the handy tree stump, surely confirms that negative reinforcement can be an enjoyable way to learn - he gets the relief of scratching the itch.

To take an example involving a human rather than a bit of wood - let's say Paddy comes hobbling in from the field. He has a big stone trapped in his foot, which I remove when he lifts his foot for me. We'd usually describe this as negative reinforcement - I've removed something aversive, and this was probably a very good experience for Paddy.

How do we decide on our preferred training methods? 
If we think in terms of positive and negative reinforcement, and decide that positive reinforcement is good, kind, ethical, call it what you will, and negative reinforcement is to be avoided where possible - the above examples don't make sense! 

Not only is negative reinforcement a good experience for Paddy in these cases, it's also sometimes not even clear whether learning has occurred through negative or positive reinforcement - that just depends on how we view it.

When we are considering the effectiveness of training, then understanding the mechanics of learning is important. 

When we are considering the ethics of training, a major factor is how it feels to the learner, rather than how it affects their behaviour. Using a model that describes how their behaviour is affected to describe how they may feel is, I think, not enough. 

What really matters, I think, is how the horse feels about the interactions we have with him.

To take an extreme example, if we apply a painful stimulus to our horse then release it when he behaves as we wish, we are using negative reinforcement, but it is the active application of a painful stimulus that we should be concerned about, not the learning mechanism. Whether an aversive stimulus is applied before or after a behaviour, or entirely at random, I'd personally like to minimise such stimuli. I'd also like to increase the things he finds nice, whenever they happen!

In summary - be nice to your horse :-)