Operant conditioning in depth (background substance)
Learning through consequences: reinforcement (positive/negative, primary/secondary), shaping and chaining.
This study is applied operant conditioning — so master the mechanism and the study explains itself.
Operant conditioning (Skinner) = behaviour is learned through its consequences:
- Positive reinforcement = adding something pleasant after a behaviour → behaviour increases (e.g. giving an elephant food after it lifts its trunk).
- Negative reinforcement = removing something unpleasant after a behaviour → behaviour increases (NOT the main method here: the point of the study is to avoid unpleasant/harsh methods).
- Punishment = a consequence that decreases behaviour (traditional elephant training often used punishment — this study offers an alternative).
Primary vs secondary reinforcers:
- Primary reinforcer = satisfies a basic need directly (e.g. food — banana, sugar cane).
- Secondary reinforcer = a neutral signal that becomes rewarding through association with a primary one (e.g. a clicker, whistle or the word 'good'). Training with such a signal is secondary positive reinforcement (SPR): the trainer uses the signal to mark the exact moment of correct behaviour, then gives food.
Shaping = reinforcing successive approximations — rewarding behaviours that get closer and closer to the target (e.g. first any trunk movement, then trunk to the bucket, then trunk in the bucket).
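The shaping rule can be sketched as a short Python simulation. This is only an illustration, not the study's procedure: the function name `shape`, the step labels and the rule that the criterion moves up after each success are assumptions made for the sketch.

```python
# Hypothetical illustration of shaping (successive approximations).
# Step labels are taken from the example above; the function name and
# the "raise the criterion after every success" rule are assumptions.
SHAPING_STEPS = [
    "any trunk movement",
    "trunk to the bucket",
    "trunk in the bucket",  # target behaviour
]


def shape(observed_behaviours, steps=SHAPING_STEPS):
    """Return a list of (behaviour, reinforced) pairs.

    A behaviour is reinforced only if it meets or exceeds the current
    criterion. After a reinforced attempt the criterion rises, so
    earlier approximations stop earning a reward.
    """
    criterion = 0  # index of the approximation currently required
    log = []
    for behaviour in observed_behaviours:
        # Behaviours not in the list rank below every approximation.
        rank = steps.index(behaviour) if behaviour in steps else -1
        reinforced = rank >= criterion
        if reinforced:
            # Require the next approximation, capped at the target.
            criterion = min(rank + 1, len(steps) - 1)
        log.append((behaviour, reinforced))
    return log


attempts = [
    "any trunk movement",   # meets the first criterion
    "any trunk movement",   # no longer enough
    "trunk to the bucket",  # meets the raised criterion
    "trunk in the bucket",  # target reached
]
for behaviour, reinforced in shape(attempts):
    print(behaviour, "->", "reward" if reinforced else "no reward")
```

The second attempt earns no reward because the criterion has already moved on: that is what makes each rewarded behaviour a closer approximation to the target.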
Behavioural chaining = linking several trained behaviours into a sequence to perform a complex action — here, the full trunk wash is a chain of separate trained steps.
Key points:
- Operant conditioning = learning through consequences.
- Positive reinforcement = add reward → behaviour increases.
- Primary reinforcer = food; secondary reinforcer = signal ('good'/clicker) linked to food.
- Shaping = reward successive approximations toward the target.
- Chaining = link trained behaviours into a sequence (the trunk wash).