Operant Conditioning

Law of Effect

Edward L. Thorndike, an American psychologist, explored how animals solve problems. He developed the idea of instrumental learning, concluding that an organism’s behaviour is instrumental in bringing about certain outcomes (consequences). He proposed the law of effect, which states that responses followed by satisfying consequences will be strengthened and become more likely to occur, whereas responses followed by unsatisfying consequences will be weakened and become less likely to occur. This law laid the foundation for operant conditioning, in which behaviour is shaped by the consequences that follow it.
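The law of effect can be illustrated with a toy calculation. The sketch below is only an illustration, not Thorndike's own formulation: the function name update_strength, the 0-to-1 strength scale, and the rate parameter are all assumptions made for the example.

```python
def update_strength(strength, satisfying, rate=0.2):
    """Return the new response strength (0..1) after one consequence.

    Hypothetical rule: strength moves toward 1 after a satisfying
    consequence and toward 0 after an unsatisfying one.
    """
    target = 1.0 if satisfying else 0.0
    return strength + rate * (target - strength)


# A response that is followed by satisfying consequences five times.
strengthened = 0.5
for _ in range(5):
    strengthened = update_strength(strengthened, satisfying=True)

# The same response followed by unsatisfying consequences five times.
weakened = 0.5
for _ in range(5):
    weakened = update_strength(weakened, satisfying=False)

print(round(strengthened, 3), round(weakened, 3))
```

Starting from the same strength, the response followed by satisfying consequences ends up more likely to occur, and the one followed by unsatisfying consequences ends up less likely.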

Skinner’s Analysis of Operant Conditioning

B. F. Skinner analyzed operant conditioning in terms of relations between antecedents (stimuli that are present before a behaviour occurs), behaviors that the organism emits, and consequences that follow the behaviors.

IF antecedent stimuli (A) are present

AND behaviour (B) is emitted

THEN consequence (C) will occur.

The relations between A and B, and between B and C, are called contingencies.

Antecedents

Antecedents that signal the likely consequences of particular behaviors in a given situation are called discriminative stimuli.

Behaviors

Operant behaviors are emitted (under voluntary control), whereas classically conditioned responses are elicited (triggered involuntarily by reflex). Classically conditioned responses are influenced by what happens before the behaviour (e.g. by the CS-UCS pairing), whereas operant behaviors are influenced by consequences that occur after the behaviour.

Consequences

Reinforcement occurs when a response is strengthened by an outcome (a reinforcer) that follows it.

With positive reinforcement, a response is followed by the presentation of a positive stimulus, so the response becomes stronger. Primary reinforcers are stimuli, such as food, that an organism naturally finds reinforcing because they satisfy biological needs. Through their association with primary reinforcers, other stimuli can become secondary, or conditioned, reinforcers (e.g. money).

With negative reinforcement, a response is followed by the removal of an aversive stimulus, so again, the response becomes stronger. A negative reinforcer is the stimulus that is removed or avoided.

Operant extinction is the weakening and eventual disappearance of a response because it is no longer reinforced. The degree to which nonreinforced responses persist is called resistance to extinction, which is strongly influenced by the pattern of reinforcement that has previously maintained the behavior.

Punishment occurs when a response is weakened by an outcome (a punisher) that follows it. With aversive punishment, a behavior is followed by the presentation of an aversive stimulus, and the behaviour becomes weaker. Punishment suppresses behavior, but does not cause the organism to forget how to make the response. This suppression may not generalize to other relevant situations.

With response cost, a behavior is followed by the removal of a positive stimulus, and the behavior becomes weaker.
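The four consequences described above differ on two dimensions: whether a stimulus is presented or removed, and whether the behaviour becomes stronger or weaker. A minimal sketch of that two-by-two classification follows; the function name classify_consequence and the string labels for its arguments are hypothetical, chosen only for this illustration.

```python
def classify_consequence(stimulus_change, behaviour_change):
    """Name the operant consequence for a (stimulus, behaviour) combination.

    stimulus_change:  "presented" or "removed"
    behaviour_change: "stronger" or "weaker"
    """
    table = {
        ("presented", "stronger"): "positive reinforcement",
        ("removed", "stronger"): "negative reinforcement",
        ("presented", "weaker"): "aversive punishment",
        ("removed", "weaker"): "response cost",
    }
    return table[(stimulus_change, behaviour_change)]


# An aversive stimulus is removed and the response becomes stronger.
print(classify_consequence("removed", "stronger"))
# A positive stimulus is removed and the behaviour becomes weaker.
print(classify_consequence("removed", "weaker"))
```

Note that "negative" in negative reinforcement refers to the removal of a stimulus, not to a weakening of behaviour: both kinds of reinforcement strengthen the response.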

Delay of gratification is the ability to forgo an immediate but smaller reward for a delayed but more satisfying outcome.

Processes

Shaping, which uses the method of successive approximations, involves the reinforcement of behaviors that increasingly resemble the final desired behaviour. Reinforcing successive approximations lets the organism acquire the behaviour much faster than waiting for the complete behaviour to occur on its own.

Chaining is used to develop a sequence (chain of responses) by reinforcing each response with the opportunity to perform the next response.

When behaviour changes in one situation due to reinforcement or punishment, and then this new response carries over to similar situations, this is called operant generalization. In contrast, when an operant response is made to one discriminative stimulus but not to another, this is called operant discrimination.

Escape and avoidance conditioning result from negative reinforcement. In escape conditioning, organisms learn a response to terminate an aversive stimulus. In avoidance conditioning, the organism learns a response to avoid an aversive stimulus. According to two-factor theory, both classical and operant conditioning are involved in avoidance learning. Fear is created through classical conditioning, and this fear motivates escape and avoidance, which is then negatively reinforced by fear reduction.

Schedules of Reinforcement

On a continuous reinforcement schedule, every response is reinforced. Partial reinforcement may occur on a ratio schedule, in which a certain percentage of responses are reinforced, or on an interval schedule, in which a certain amount of time must pass before a response gets reinforced. In general, ratio schedules produce higher rates of performance than interval schedules.

On fixed-ratio and fixed-interval schedules, reinforcement always occurs after a fixed number of correct responses or for the first correct response after a fixed time interval. On variable schedules, the required number of responses or interval of time varies around some average.

Learning occurs most rapidly under continuous reinforcement, but partial schedules produce habits that are more resistant to extinction.
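The four partial schedules described above can be sketched as small rules that decide whether a given response earns a reinforcer. This is a toy simulation under stated assumptions, not a model of real behaviour: the class names, the respond(t) method, and the uniform distributions used for the variable schedules are all hypothetical choices for the example. Interval schedules here reinforce the first response made once the required time has passed.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a number of responses that varies around mean_n."""
    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean_n - 1, so the average is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made after a fixed time has passed."""
    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0  # time of the most recent reinforcer

    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False


class VariableInterval:
    """Reinforce the first response after a time that varies around mean."""
    def __init__(self, mean, rng):
        self.mean = mean
        self.rng = rng
        self.last = 0.0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 0 .. 2*mean, so the average is mean.
        return self.rng.uniform(0, 2 * self.mean)

    def respond(self, t):
        if t - self.last >= self.required:
            self.last = t
            self.required = self._draw()
            return True
        return False


def count_reinforcers(schedule, response_times):
    """Count how many of the responses at the given times are reinforced."""
    return sum(schedule.respond(t) for t in response_times)


times = range(1, 61)  # one response per time unit, 60 responses in total
rng = random.Random(0)
fr = count_reinforcers(FixedRatio(5), times)       # 12 reinforcers
fi = count_reinforcers(FixedInterval(10), times)   # 6 reinforcers
vr = count_reinforcers(VariableRatio(5, rng), times)
vi = count_reinforcers(VariableInterval(10, rng), times)
print(fr, fi, vr, vi)
```

On a ratio schedule, responding faster earns reinforcers faster, whereas on an interval schedule extra responses before the interval has elapsed earn nothing. That difference is one way to see why ratio schedules tend to produce higher rates of responding.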



