the joys of positive reinforcement

I was walking one of my clients’ dogs today and came across a sticky situation. My dog has had aggression issues towards other dogs (and sometimes people) so her owner and past trainer have done extensive work to desensitize and counter-condition her towards other dogs. So here I am, walking her down a busy NYC street when I come across the following scenario:

(yes, I have pink sneakers).

I first see Person B walking ahead of us – a man walking at least four dogs at one time. He is struggling to keep them under control while picking up one dog’s poop.

Then I see Person A coming towards us from the other end of the sidewalk. He is holding onto a boxer-mix type dog.

B sees A. B starts panicking and pulls his dogs toward the buildings to his right. A starts wrapping his dog’s leash multiple times around his wrist. Dogs of both A and B are pulling and lunging towards each other.

Then there’s me and my dog. My dog looks at me. I click and treat. While the other dogs are struggling to pull away from their human, mine is happy as a clam and staring straight at me.

ah the rewards of being a positive reinforcement trainer. ( :

the foundation: discriminative stimuli

In the last foundation post, I introduced the ABCs of ABA: Antecedent, Behavior, and Consequence. Today I am going to introduce another set of letters: the SD. SD stands for discriminative stimulus (read: ess-dee). In an operant contingency, such as the ABC examples I posted last time, the SD is basically the same thing as the antecedent stimulus. It is the stimulus that was present in the past when the behavior produced a consequence.

In the “sit” example,

Antecedent | Behavior | Consequence
owner says “sit” | dog sits | owner praises and pats dog

the owner saying “sit” is considered the SD. The owner saying “sit” is correlated with the fact that if the dog emits the correct behavior (sitting down), reinforcement is available. However, if the dog lies down instead, the owner will not (or should not, at least) give the dog praise.

This is a big problem in dog training when the owner is not clear with her dog about what behavior she wants. Many dogs will default to sitting down when they see a piece of food in their owner’s hand. Or they will sit when the owner stands a certain way in front of them. This just shows that the SD in those cases is not the vocal cue “sit”, but the presence of the food or the owner’s stance. If your dog only sits when you have food in your hand, that’s because you’ve taught them that the food is the SD for sitting, and you want to change that as soon as you can. I will be posting a “sit” protocol on this blog soon!

A human example: see chocolate bar (SD) -> open chocolate bar (behavior) -> eat chocolate (consequence = yum). In the future, the presence of a chocolate bar will most likely end with you eating it, because eating it is automatically reinforcing.

In summary, SDs are basically the signal that tells someone or something what behavior to emit.

clicker training

Many dog trainers and owners these days use clicker training. The principles of clicker training are rooted in positive reinforcement and the conditioned reinforcer. Clickers can look like this:

[images: three different clickers]

There may be others; these are just some I found after a simple Google search for “clicker”. They all make the same clicking noise (although the one in the middle image is an i-click, which is a bit quieter than the average clicker). The clicking noise is paired with food. We (trainers) like to use high-value food, something better than the usual Milk-Bone or “low-quality” dog treats that owners freely give to their dogs: chicken, steak, salmon, etc. The clicker is paired with the food by clicking and then immediately feeding the dog a piece of food. The dog learns that the click predicts food, so the click itself becomes a conditioned reinforcer. Often, when owners just take out the clicker, dogs get excited! They know good things are to come.

Tips on pairing the clicker with food:

  • be in a neutral position: don’t make eye contact with the dog, and try to vary your body position. this way, the dog will associate only the sound of the clicker with the food, not your body position, eye contact, or anything else you may be consistently doing.
  • always feed immediately after clicking
  • the food should be of high value
  • this does not take long – no more than 15-20 click/food pairings should be sufficient

Why do we use clickers?
Whenever we want to teach a new behavior to an animal, we want to use positive reinforcement. And usually, the best reinforcer to use is food. We are, in essence, playing a game with the animal, telling them: if you do this behavior, you will be rewarded with food. However, during training, it is important that the consequence follows immediately after the behavior. For example, if the animal emits the behavior you are teaching and you deliver the food thirty seconds later, who knows what happened in those thirty seconds? The animal may have been doing a totally unrelated behavior in the meantime and may not make the association between the target behavior and the food. Therefore, we use the clicker as a bridging stimulus: it bridges the gap between the behavior and the reinforcer. We are able to click (after some human learning) faster than we can give the animal food, which makes animal training more efficient.

the foundation: reinforcers & punishers

There are two classes of reinforcers & punishers: unconditioned and conditioned.

Unconditioned reinforcers & punishers are those that are innately reinforcing or punishing to us, that is, ones that do not require any learning. Some examples of unconditioned reinforcers include food, water, oxygen, and sex. Some unconditioned punishers are extreme temperatures, food or eating when you are already full, and pain.

Conditioned reinforcers & punishers are those that have been paired with other reinforcers or punishers in the past and so have themselves become reinforcing or punishing. These conditioned stimuli vary from person to person. While one stimulus can be a reinforcer for me, it may be punishing to another person. Conditioned reinforcers can be just as powerful as unconditioned ones. An example of a personal conditioned reinforcer is jewelry. An example of a personal conditioned punisher is cheese (blech).

Dog trainers use a lot of unconditioned (treats) and conditioned (clickers) reinforcers. Usually the conditioned reinforcer is paired with food and thus becomes a reinforcer. However, this in no way has to be a clicker – one could also use one’s voice or a hand motion.

More on clicker training coming up in our next training tips post!

the foundation: ABC – the three-term contingency

So far we have talked about two major topics in ABA: reinforcement and punishment. These are both operant operations – ones that are based on the consequences of the behavior. Behavior analysts use the basic “unit” of ABC when analyzing operant behavior:

Antecedent
Behavior
Consequence

The ABC is a three-term contingency which outlines that in the presence of an antecedent stimulus, the behavior occurs. The consequence then strengthens or weakens that antecedent-behavior relationship so that in the future, the behavior will occur more or less frequently in the presence of that antecedent. I think this is easiest to demonstrate with examples:

Antecedent | Behavior | Consequence
owner says “sit” | dog sits | owner praises and pats dog
you see refrigerator | open fridge | get food
raining outside | take umbrella | stay dry
red traffic light | speed up | get into car accident
ad pops up on PC | click ad | PC shuts down

In the above table, you can see that there is something that immediately precedes a behavior. The consequence that occurs after the behavior will then determine if the behavior will occur more or less frequently in the future. The first three ABCs are examples of positive reinforcement, while the last two are examples of positive punishment. Can you make out why?

I’ll explain it for the first example. Remember, reinforcement is the increase in the future probability of the behavior. In the past, whenever the owner said “sit” and the dog sat, the owner would give the dog a treat or praise. This will increase the frequency of the dog sitting when the owner says “sit” in the future. Try doing that for the remaining four examples!
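For readers who like to see things written out as data, here is a minimal Python sketch of the table above. It is only an illustration, not a formal ABA tool: the `Contingency` class, the `effect` field, and the `operation` helper are names I made up for this example, and the "more"/"less" labels are my reading of how the rows are described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """One antecedent-behavior-consequence (ABC) unit."""
    antecedent: str
    behavior: str
    consequence: str
    # Hypothetical label: does the behavior become "more" or "less"
    # frequent in the future, in the presence of this antecedent?
    effect: str


# Rows copied from the table above; the effect labels are assumptions
# based on the post's description of each row.
TABLE = [
    Contingency('owner says "sit"', "dog sits", "owner praises and pats dog", "more"),
    Contingency("you see refrigerator", "open fridge", "get food", "more"),
    Contingency("raining outside", "take umbrella", "stay dry", "more"),
    Contingency("red traffic light", "speed up", "get into car accident", "less"),
    Contingency("ad pops up on PC", "click ad", "PC shuts down", "less"),
]


def operation(c: Contingency) -> str:
    """Reinforcement if the behavior increases, punishment if it decreases."""
    return "reinforcement" if c.effect == "more" else "punishment"


for c in TABLE:
    print(f"{c.antecedent} -> {c.behavior} -> {c.consequence}: {operation(c)}")
```

The point of the sketch is that the consequence alone does not decide the label; what matters is what happens to the behavior in the future.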

Three-term contingencies are all around us. Can you think of any real-world examples around you?

Next post: Discriminative Stimuli

the foundation: positive + negative reinforcement

Now that you know the true definitions of reinforcement + punishment, it’s important to clarify the terms positive and negative. These terms also carry common misconceptions from everyday language, where positive = a good thing and negative = a bad thing. However, this does not apply in ABA.

Positive denotes the presentation of a stimulus – something that had not been there before the behavior.

Therefore, positive reinforcement occurs when after a behavior, something that was not there before is presented, and this causes the frequency of the behavior to increase in the future.

Example: Your dog is standing in front of you. You put some treats in your hand, and lift your hand up over the dog’s nose. This will cause your dog to tilt his head back as far as possible, until he has to sit down. You then give him the treats. You repeat this one more time and the same thing happens. You remove the treats from your hand and just lift your empty hand, and your dog sits. Now every time you lift your hand in front of your dog, your dog sits.

In this case, the treats that are given to the dog after he sits are the new stimulus that is presented (positive), and since the sitting behavior increases, it is reinforcement.

Negative denotes that something that was already in the environment is taken away.

Therefore, negative reinforcement occurs when after a behavior, something that was present in the environment is removed, and this causes the frequency of the behavior to increase in the future. This is usually associated with aversive stimuli.

Example: You get a mosquito bite on your leg, and it itches. A lot. You know you shouldn’t, but you scratch it. When you scratch it, the itching goes away and it feels good. So in the future, you scratch your mosquito bite more often to get rid of the itching.

In this scenario, the itch is the stimulus that is taken away (negative) by the behavior, and since scratching increases in the future, it is reinforcement.
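If it helps, the two questions we keep asking (was a stimulus presented or removed, and did the behavior increase or decrease?) can be written as a tiny lookup. This is just a sketch in Python: the function name `classify` and its string arguments are my own invention for illustration, and the "decreases" branch covers punishment, which gets its own post.

```python
def classify(stimulus: str, behavior: str) -> str:
    """Name the operant procedure from two observations.

    stimulus: "presented" (something new appears after the behavior)
              or "removed" (something already there is taken away).
    behavior: "increases" or "decreases" in future frequency.
    """
    sign = {"presented": "positive", "removed": "negative"}[stimulus]
    operation = {"increases": "reinforcement", "decreases": "punishment"}[behavior]
    return f"{sign} {operation}"


# Treats are presented after sitting, and sitting increases.
print(classify("presented", "increases"))  # positive reinforcement

# The itch is removed by scratching, and scratching increases.
print(classify("removed", "increases"))  # negative reinforcement
```

Notice that "good" and "bad" never appear anywhere in the function: positive and negative only describe whether something was added or taken away.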

Next post: Positive & Negative Punishment