the joys of positive reinforcement

I was walking one of my clients’ dogs today and came across a sticky situation. This dog has had aggression issues towards other dogs (and sometimes people), so her owner and past trainer have done extensive work to desensitize and counter-condition her towards other dogs. So here I am, walking her down a busy NYC street when I come across the following scenario:

(yes, I have pink sneakers).

I first see Person B walking ahead of us – a man walking at least four dogs at one time. He is struggling to keep them under control while picking up one dog’s poop.

Then I see Person A coming towards us from the other end of the sidewalk. He is holding onto a boxer-mix type dog.

B sees A. B starts panicking and pulls his dogs toward the buildings to his right. A starts wrapping his dog’s leash multiple times around his wrist. Dogs of both A and B are pulling and lunging towards each other.

Then there’s me and my dog. My dog looks at me. I click and treat. While the other dogs are struggling to pull away from their humans, mine is happy as a clam and staring straight at me.

ah the rewards of being a positive reinforcement trainer. ( :

the foundation: discriminative stimuli

In the last foundation post, I introduced the ABCs of ABA: Antecedent, Behavior, and Consequence. Today I am going to introduce another set of letters: the SD. SD stands for discriminative stimulus (read: ess-dee). In operant conditioning, such as the ABC examples I posted last time, the SD is basically the same thing as the antecedent stimulus. It is the stimulus that, in the past, was present when the behavior was followed by a consequence.

In the “sit” example,

Antecedent | Behavior | Consequence
owner says “sit” | dog sits | owner praises and pats dog

the owner saying “sit” is considered the SD. The owner saying “sit” signals that reinforcement is available if the dog emits the correct behavior (sitting down). However, if the dog lies down instead, the owner will not (or should not, at least) give the dog praise.

This is a big problem in dog training when the owner is not clear with her dog about what behavior she wants. Many dogs will default to sitting down when they see a piece of food in their owner’s hand. Or they will sit when the owner stands a certain way in front of them. This just shows that the SD in those cases is not the vocal cue “sit”, but the presence of the food or the owner’s stance. If your dog only sits when you have food in your hand, that’s because you’ve taught them that the food is the SD for sitting – you want to change that as soon as you can. I will be posting a “sit” protocol on this blog soon!

A human example: see chocolate bar (SD) –> open chocolate bar (behavior) –> eat chocolate (consequence = yum). In the future, the presence of a chocolate bar will most likely end with you eating it, because eating chocolate is automatically reinforcing.

In summary, SDs are basically the signal that tells someone or something what behavior to emit.

the ABCs of desensitizing & counter-conditioning dogs that are aggressive towards other dogs

All of the dogs I work with have/had aggression problems toward other dogs. How do we remedy the situation? Desensitize the dogs to other dogs and counter-condition them to learn that other dogs are predictors of good things. Here’s how it works:

  1. Dog sees other dog on street
  2. Owner clicks & treats immediately (as in, before dog reacts)
  3. Dog gets chicken/salmon/good stuff

Dog learns that seeing another dog = yummy food and soon learns to associate other dogs with food. The ABC then becomes:

A: dog sees other dog on street
B: dog looks at owner
C: dog gets treat

Dog will then learn to look at owner to anticipate treat.

Positive reinforcement is yummy.

clicker training

Many dog trainers and owners these days use clicker training. The principles of clicker training are rooted in positive reinforcement and the conditioned reinforcer. Clickers can look like this:

[Images: three different clickers]

and maybe there are others – these are just some I found after a simple Google search for “clicker”. They all make the same clicking noise (although the image in the middle is of an i-click, which is a bit quieter than the average clicker). The clicking noise is paired with food. We (trainers) like to use high-value food – better than the usual milkbone or “low-quality” dog treats that owners freely give to their dogs. We like to use chicken, steak, salmon, etc. The clicker is paired with the food by clicking and then immediately feeding the dog a piece of food. The dog then learns that the clicker predicts food, so the clicker itself becomes a conditioned reinforcer. Many times when owners just take out the clicker, dogs get excited! They know good things are to come.

Tips on pairing the clicker with food:

  • be in a neutral position – don’t make eye contact with the dog and try to vary your body positions. this way, the dog will not associate your body position, eye contact, or anything else you may be consistently doing with the food, but just the sound of the clicker.
  • always feed immediately after clicking
  • the food should be of high value
  • this does not take long – no more than 15-20 click/food pairings should be sufficient

Why do we use clickers?
Whenever we want to teach a new behavior to an animal, we want to use positive reinforcement. And usually, the best reinforcer to use is food. We are, in essence, playing a game with the animal, telling them: if you do this behavior, you will be rewarded with food. However, during training, it is important that the consequence follows immediately after the behavior. For example, if you are teaching a behavior, the animal emits the behavior, and then thirty seconds later you give the food, who knows what happened in those thirty seconds?? The animal may have been doing a totally unrelated behavior and may not make the association between the behavior and the food. Therefore, we use the clicker as a bridging stimulus. It bridges the gap between the behavior and the reinforcer. We are able to click (after some human learning) faster than we can give the animal food. It increases efficiency in animal training.

the foundation: reinforcers & punishers

There are two classes of reinforcers & punishers: unconditioned and conditioned.

Unconditioned reinforcers & punishers are those that are innately reinforcing or punishing to us. As in, ones that do not require any learning. Some examples of unconditioned reinforcers include food, water, oxygen, and sex. Some unconditioned punishers are extreme temperatures, eating food when you are already full, and pain.

Conditioned reinforcers & punishers are those that have been paired with other reinforcers or punishers in the past and so have themselves become reinforcing or punishing. These conditioned stimuli vary from person to person. While one stimulus can be a reinforcer for me, it may be punishing to another person. Conditioned reinforcers can be just as powerful as unconditioned ones. An example of a personal conditioned reinforcer is jewelry. An example of a personal conditioned punisher is cheese (blech).

Dog trainers use a lot of unconditioned (treats) and conditioned (clickers) reinforcers. Usually the conditioned reinforcer is paired with food and thus becomes a reinforcer. However, this in no way has to be a clicker – one could also use one’s voice or a hand motion.

More on clicker training coming up in our next training tips post!

the foundation: ABC – the three-term contingency

So far we have talked about two major topics in ABA: reinforcement and punishment. These are both operant operations – ones that are based on the consequences of the behavior. Behavior analysts use the basic “unit” of ABC when analyzing operant behavior:

Antecedent
Behavior
Consequence

The ABC is a three-term contingency which outlines that in the presence of an antecedent stimulus, the behavior occurs. The consequence then strengthens or weakens that antecedent-behavior relationship so that in the future, the behavior will occur more or less frequently in the presence of that antecedent. I think this is easiest to demonstrate with examples:

Antecedent | Behavior | Consequence
owner says “sit” | dog sits | owner praises and pats dog
you see refrigerator | open fridge | get food
raining outside | take umbrella | stay dry
red traffic light | speed up | get into car accident
ad pops up on PC | click ad | PC shuts down

In the above table, you can see that there is something that immediately precedes each behavior. The consequence that occurs after the behavior will then determine if the behavior will occur more or less frequently in the future. All but the last two ABCs are examples of positive reinforcement, while the last two are examples of positive punishment. Can you make out why?

I’ll explain it for the first example. Remember, reinforcement is the increase in the future probability of the behavior. In the past, whenever the owner said “sit” and the dog sat, the owner would give the dog a treat or praise. This will increase the frequency of the dog sitting when the owner says “sit” in the future. Try doing that for the remaining examples!
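If it helps to see the three terms as a little record, here is a minimal Python sketch. The class name `ABC`, the field `behavior_increases`, and the method `operation` are made up for illustration and are not standard ABA notation. It only encodes two rows from the table: the “sit” example explained above, and the red traffic light, which is one of the last two rows (the punishment examples).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ABC:
    """One three-term contingency (hypothetical helper, for illustration only)."""
    antecedent: str
    behavior: str
    consequence: str
    behavior_increases: bool  # True if the behavior becomes more frequent

    def operation(self) -> str:
        # Reinforcement = behavior increases; punishment = behavior decreases.
        return "reinforcement" if self.behavior_increases else "punishment"


examples = [
    ABC('owner says "sit"', "dog sits", "owner praises and pats dog", True),
    ABC("red traffic light", "speed up", "get into car accident", False),
]

for abc in examples:
    print(f"{abc.antecedent} -> {abc.behavior} -> {abc.consequence}: {abc.operation()}")
```

Whether a row counts as reinforcement or punishment is decided by what happens to the behavior afterwards, which is why that is stored as its own field rather than guessed from the consequence text.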

Three-term contingencies are all around us. Can you think of any real-world examples around you?

Next post: Discriminative Stimuli

the foundation: positive + negative punishment

Review:

  • punishment – the decrease of a behavior contingent on a consequence
  • positive – the presentation of a stimulus
  • negative – the removal of a stimulus
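The review terms combine two at a time: one word for what happens to the stimulus, one for what happens to the behavior. As a rough sketch (the names `Stimulus`, `Effect`, and `label` are invented here for illustration), the combination can be written in Python like this, using the definition of reinforcement from the earlier foundation posts as the opposite of punishment:

```python
from enum import Enum


class Stimulus(Enum):
    PRESENTED = "positive"  # a stimulus is presented after the behavior
    REMOVED = "negative"    # a stimulus is removed after the behavior


class Effect(Enum):
    INCREASES = "reinforcement"  # the behavior becomes more likely in the future
    DECREASES = "punishment"     # the behavior becomes less likely in the future


def label(stimulus: Stimulus, effect: Effect) -> str:
    """Combine the two review terms into one label (hypothetical helper)."""
    return f"{stimulus.value} {effect.value}"


# The two procedures covered in this post:
print(label(Stimulus.PRESENTED, Effect.DECREASES))  # positive punishment
print(label(Stimulus.REMOVED, Effect.DECREASES))    # negative punishment
```

Note that nothing in the sketch says whether the stimulus is pleasant or unpleasant; the label depends only on presentation versus removal and on the change in the behavior.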

It is oh-so-important to keep in mind that punishment is not necessarily an aversive, bad, painful, emotional, (insert any word normally associated with punishment) thing, although many times, it can be. (*Disclaimer: Please keep in mind that punishment procedures can be ethically challenging. This post is just informative and is by no means an endorsement or condemnation of using them in the real world.)

This post will basically mirror the previous foundation post for consistency and hopefully ease of understanding.

According to the definitions above, positive punishment is the presentation of a stimulus following a behavior that causes that behavior to decrease/go away.

Example: One evening, I left the car radio on before I turned the engine off. The next morning when I started the car, the radio BLASTED music making me jump. The rush of blood to my head and being startled taught me to always turn the radio off before I turn off the engine.

This kind of learning is very common with painful or startling experiences. The loud music was presented to (& scared) me (positive), so that really decreased the keeping-radio-on-when-turning-off-engine behavior in the future (punishment).

Along the same lines, negative punishment is when, after a behavior, something is removed from the environment, causing the future probability of that behavior occurring to decrease. The stimulus that is removed in negative punishment procedures is usually something that the organism finds reinforcing.

Example: You are playing tug-of-war with your dog. Woohoo! It’s a lot of fun! The dog loves it, you love it. But, suddenly, Fido’s teeth come a bit too close to your side of the tug toy – as in, you feel his teeth on your skin. You immediately drop your end of the tug toy, which makes the tug toy now lifeless and not-so-interesting for your dog. After a few seconds, you pick it up again and play! But next time the dog’s teeth come too close, you drop the toy and look away. In the future when you play tug-of-war with Fido, Fido’s teeth don’t come near you.

In this story, the tug-of-war toy (which is very reinforcing for the dog) is removed (negative) after the behavior of teeth touching skin. Furthermore, since the behavior goes away, it is punishment. This is a great way to teach your dog how to play tug-of-war nicely! Your dog will soon learn that touching teeth to skin means no more fun.