Science and Human Behavior

CHAPTER XII

PUNISHMENT

A QUESTIONABLE TECHNIQUE

The commonest technique of control in modern life is punishment. The pattern is familiar: if a man does not behave as you wish, knock him down; if a child misbehaves, spank him; if the people of a country misbehave, bomb them. Legal and police systems are based upon such punishments as fines, flogging, incarceration, and hard labor. Religious control is exerted through penances, threats of excommunication, and consignment to hell-fire. Education has not wholly abandoned the birch rod. In everyday personal contact we control through censure, snubbing, disapproval, or banishment. In short, the degree to which we use punishment as a technique of control seems to be limited only by the degree to which we can gain the necessary power. All of this is done with the intention of reducing tendencies to behave in certain ways. Reinforcement builds up these tendencies; punishment is designed to tear them down.

The technique has often been analyzed, and many familiar questions continue to be asked. Must punishment be closely contingent upon the behavior punished? Must the individual know what he is being punished for? What forms of punishment are most effective and under what circumstances? This concern may be due to the realization that the technique has unfortunate by-products. In the long run, punishment, unlike reinforcement, works to the disadvantage of both the punished organism and the punishing agency. The aversive stimuli which are needed generate emotions, including predispositions to escape or retaliate, and disabling anxieties. For thousands of years men have asked whether the method could not be improved or whether some alternative practice would not be better.

DOES PUNISHMENT WORK?

More recently, the suspicion has also arisen that punishment does not in fact do what it is supposed to do. An immediate effect in reducing a tendency to behave is clear enough, but this may be misleading. The reduction in strength may not be permanent. An explicit revision in the theory of punishment may be dated by the changes in the theories of E. L. Thorndike. Thorndike’s first formulation of the behavior of his cats in a puzzle box appealed to two processes: the stamping in of rewarded behavior, or operant conditioning, and a converse process of stamping out as the effect of punishment. Thorndike’s later experiments with human subjects required a change in this formulation. The rewards and punishments he used were the relatively mild, verbal conditioned reinforcers of “right” and “wrong.” Thorndike found that although “right” strengthened the behavior that preceded it, “wrong” did not weaken it. The relatively trivial nature of the punishment was probably an advantage, since the collateral effects of severe punishment could be avoided and the absence of a weakening effect could therefore be observed without interference from other processes.

The difference between immediate and long-term effects of punishment is clearly shown in animal experiments. In the process of extinction the organism emits a certain number of responses which can be reasonably well predicted. As we have seen, the rate is at first high and then falls off until no significant responding occurs. The cumulative extinction curve is one way of representing the net effect of reinforcement, an effect which we may describe as a predisposition to emit a certain number of responses without further reinforcement. If we now punish the first few responses emitted in extinction, the theory of punishment would lead us to expect that the rest of the extinction curve would contain fewer responses. If we could choose a punishment which subtracted the same number of responses as are added by a reinforcement, then fifty reinforced responses followed by twenty-five punished responses should leave an extinction curve characteristic of twenty-five reinforced responses. When a similar experiment was performed, however, it was found that although punishing responses at the beginning of an extinction curve reduced the momentary rate of responding, the rate rose again when punishment was discontinued and that eventually all responses came out. The effect of punishment was a temporary suppression of the behavior, not a reduction in the total number of responses. Even under severe and prolonged punishment, the rate of responding will rise when punishment has been discontinued, and although under these circumstances it is not easy to show that all the responses originally available will eventually appear, it has been found that after a given time the rate of responding is no lower than if no punishment had taken place.
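
The contrast between the two expectations can be made concrete with a toy model. The sketch below is not drawn from the experiments described; it assumes, purely for illustration, that each reinforcement contributes a fixed number of responses to a reserve (the hypothetical constant RESPONSES_PER_REINFORCEMENT), that the rate of responding in extinction is proportional to the responses still remaining, and that punishment merely scales the momentary rate down while it is in force. The function names and all numerical values are likewise assumptions.

```python
# Toy model only: every constant here is an illustrative assumption,
# not a value taken from the experiments described in the text.
RESPONSES_PER_REINFORCEMENT = 4  # hypothetical scale factor


def subtraction_prediction(reinforced, punished):
    """Total responses expected in extinction if each punishment
    cancelled exactly one reinforcement (the older theory)."""
    return RESPONSES_PER_REINFORCEMENT * (reinforced - punished)


def extinction_curve(reserve, steps, decay=0.1,
                     punished_steps=0, suppression=0.2):
    """Cumulative responses at each step of extinction.

    The momentary rate is proportional to the responses not yet emitted.
    During the first `punished_steps` steps the rate is multiplied by
    `suppression`; the reserve itself is left untouched.
    """
    emitted = 0.0
    curve = []
    for step in range(steps):
        rate = decay * (reserve - emitted)
        if step < punished_steps:
            rate *= suppression  # temporary suppression, not subtraction
        emitted += rate
        curve.append(emitted)
    return curve


reserve = RESPONSES_PER_REINFORCEMENT * 50  # fifty reinforced responses
unpunished = extinction_curve(reserve, steps=300)
punished = extinction_curve(reserve, steps=300, punished_steps=10)

# Older theory: 50 reinforced and 25 punished should look like 25 reinforced.
print(subtraction_prediction(50, 25), subtraction_prediction(25, 0))

# Suppression model: fewer responses early, the same total in the end.
print(round(unpunished[9], 1), round(punished[9], 1))
print(round(unpunished[-1], 3), round(punished[-1], 3))
```

Under these assumptions the punished curve lags behind the unpunished one while punishment is in force and then converges on the same total, which is the pattern of temporary suppression reported above rather than the subtraction the older theory would predict.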

The fact that punishment does not permanently reduce a tendency to respond is in agreement with Freud’s discovery of the surviving activity of what he called repressed wishes. As we shall see later, Freud’s observations can be brought into line with the present analysis.

THE EFFECTS OF PUNISHMENT

If punishment is not the opposite of reward, if it does not work by subtracting responses where reinforcement adds them, what does it do? We can answer this question with the help of our analysis of escape and of avoidance and anxiety. The answer supplies not only a clear-cut picture of the effect of punishment but an explanation of its unfortunate by-products. The analysis is somewhat detailed, but it is essential to the proper use of the technique and to the therapy required to correct some of its consequences.

We must first define punishment without presupposing any effect. This may appear to be difficult. In defining a reinforcing stimulus we could avoid specifying physical characteristics by appealing to the effect upon the strength of the behavior. If a punishing consequence is also defined without reference to its physical characteristics and if there is no comparable effect to use as a touchstone, what course is open to us? The answer is as follows. We first define a positive reinforcer as any stimulus the presentation of which strengthens the behavior upon which it is made contingent. We define a negative reinforcer (an aversive stimulus) as any stimulus the withdrawal of which strengthens behavior. Both are reinforcers in the literal sense of reinforcing or strengthening a response. Insofar as scientific definition corresponds to lay usage, they are both “rewards.” In solving the problem of punishment, we simply ask: What is the effect of withdrawing a positive reinforcer or presenting a negative? An example of the former would be taking candy from a baby; an example of the latter, spanking a baby. We have not used any new terms in posing these questions and hence need not define any. Yet insofar as we are able to give a scientific definition of a lay term, these two possibilities appear to constitute the field of punishment. We do not presuppose any effect; we simply raise a question to be answered by appropriate experiments. The physical specifications of both kinds of consequences are determined in the case in which behavior is strengthened. Conditioned reinforcers, including the generalized reinforcers, fit the same definition: we punish by disapproving, by taking money away, as in a legal fine, and so on.

Although punishment is a powerful technique of social control, it is not necessarily administered by another individual. The burned child has been punished for touching flame. Eating unsuitable food is punished by indigestion. It is not necessary that the contingency represent an established functional relation, such as that between flames and burns or certain foods and indigestion. When a salesman in a midwestern city once approached a house and rang the doorbell, the rear of the house exploded. There was only an accidental and very rare contingency: gas had escaped into the kitchen, and the explosion was set off by sparks from the electric doorbell. The effect upon the subsequent behavior of the salesman as he rang other doorbells nevertheless falls within the present field.

A FIRST EFFECT OF PUNISHMENT

The first effect of the aversive stimuli used in punishment is confined to the immediate situation. It need not be followed by any change in behavior upon later occasions. When we stop a child from giggling in church by pinching it severely, the pinch elicits responses which are incompatible with laughing and powerful enough to suppress it. Although our action may have other consequences, we can single out the competing effect of the responses elicited by the punishing stimulus. The same effect is obtained with a conditioned stimulus when we stop the child with a threatening gesture. This requires earlier conditioning, but the current effect is simply the elicitation of incompatible behavior—the responses appropriate, for example, to fear. The formula can be extended to include emotional predispositions. Thus we may stop a man from running away by making him angry. The aversive stimulus which makes him angry may be unconditioned (for example, stamping on his toe) or conditioned (for example, calling him a coward). We may stop someone from eating his dinner by frightening him with a sudden deafening noise or a gruesome story.

It is not essential to this effect that the aversive stimulus be contingent upon behavior in the standard punishing sequence. When that sequence is observed, however, the effect still occurs and must be considered as one of the results of punishment. It resembles other effects of punishment in bringing undesirable behavior to an end; but since this is temporary, it is not likely to be accepted as typical of control through punishment.

A SECOND EFFECT OF PUNISHMENT

Punishment is generally supposed to have some abiding effect. It is hoped that some change in behavior will be observed in the future, even though further punishment is withheld. One enduring effect, also not often considered as typical, resembles the effect just considered. When a child who has been pinched for giggling starts to giggle upon a later occasion, his own behavior may supply conditioned stimuli which, like the mother’s threatening gesture, evoke opposed emotional responses. We have seen an adult parallel in the use of drugs which induce nausea or other aversive conditions as consequences of drinking alcoholic beverages. As a result later drinking generates conditioned aversive stimuli which evoke responses incompatible with further drinking. As an effect of the severe punishment of sexual behavior, the early stages of such behavior generate conditioned stimuli giving rise to emotional responses which interfere with the completion of the behavior. One difficulty with the technique is that punishment for sexual behavior may interfere with similar behavior under socially acceptable circumstances—for example, in marriage. In general, then, as a second effect of punishment, behavior which has consistently been punished becomes the source of conditioned stimuli which evoke incompatible behavior.

Some of this behavior involves glands and smooth muscles. Let us say, for example, that a child is consistently punished for lying. The behavior is not easily specified, since a verbal response is not necessarily in itself a lie but can be defined as such only by taking into account the circumstances under which it is emitted. These circumstances come to play a conspicuous role, however, so that the total situation stimulates the child in a characteristic fashion. For reasons which we shall examine in Chapter XVII, an individual is in general able to tell when he is lying. The stimuli to which he responds when he does so are conditioned to elicit responses appropriate to punishment: his palms may sweat, his pulse may speed up, and so on. When he later lies during a lie-detection test, these conditioned responses are recorded.

Strong emotional predispositions are also rearoused by the beginnings of severely punished behavior. These are the main ingredient of what we speak of as guilt, shame, or a sense of sin. Part of what we feel when we feel guilty are conditioned responses of glands and smooth muscles of the kind reported by the lie detector, but we may also recognize a displacement of the normal probabilities of our behavior. This is often the most conspicuous feature of the guilt of others. The furtive look, the skulking manner, the guilty way of speaking are emotional effects of the conditioned stimuli aroused by punished behavior. Comparable effects are observed in lower animals: the guilty behavior of a dog which is behaving in a way which has previously been punished is a familiar spectacle. A case may be easily set up in the laboratory. If a rat has been conditioned to press a lever by being reinforced with food and is then punished by being lightly shocked as it presses the lever, its behavior in approaching and touching the lever will be modified. The early stages in the sequence generate conditioned emotional stimuli which alter the behavior previously established. Since the punishment is not directly administered by another organism, the pattern does not resemble the more familiar behavior of guilt in the pet dog.

A condition of guilt or shame is generated not only by previously punished behavior but by any consistent external occasion for such behavior. The individual may feel guilty in a situation in which he has been punished. We gain control by introducing stimuli for just this effect. For example, if we punish a child for any behavior executed after we have said “No, no!” this verbal stimulus will later evoke an emotional state appropriate to punishment. When this policy has been followed consistently, the behavior of the child may be controlled simply by saying “No, no!” since the stimulus arouses an emotional condition which conflicts with the response to be controlled.

Although the rearousal of responses appropriate to aversive stimuli is again not the main effect of punishment, it works in the same direction. In none of these cases, however, have we supposed that the punished response is permanently weakened. It is merely temporarily suppressed, more or less effectively, by an emotional reaction.

A THIRD EFFECT OF PUNISHMENT

We come now to a much more important effect. If a given response is followed by an aversive stimulus, any stimulation which accompanies the response, whether it arises from the behavior itself or from concurrent circumstances, will be conditioned. We have just appealed to this formula in accounting for conditioned emotional reflexes and predispositions, but the same process also leads to the conditioning of aversive stimuli which serve as negative reinforcers. Any behavior which reduces this conditioned aversive stimulation will be reinforced. In the example just considered, as the rat approaches the lever to which its recent responses have been punished, powerful conditioned aversive stimuli are generated by the increasing proximity of the lever and by the rat’s own behavior of approach. Any behavior which reduces these stimuli—turning or running away, for example—is reinforced. Technically we may say that further punishment is avoided.

The most important effect of punishment, then, is to establish aversive conditions which are avoided by any behavior of “doing something else.” It is important—for both practical and theoretical reasons—to specify this behavior. It is not enough to say that what is strengthened is simply the opposite. Sometimes it is merely “doing nothing” in the form of actively holding still. Sometimes it is behavior appropriate to other current variables which are not, however, sufficient to explain the level of probability of the behavior without supposing that the individual is also acting “for the sake of keeping out of trouble.”

The effect of punishment in setting up behavior which competes with, and may displace, the punished response is most commonly described by saying that the individual represses the behavior, but we need not appeal to any activity which does not have the dimensions of behavior. If there is any repressing force or agent, it is simply the incompatible response. The individual contributes to the process by executing this response. (In Chapter XVIII we shall find that another sort of repression involves the individual’s knowledge of the repressed act.) No change in the strength of the punished response is implied.

If punishment is repeatedly avoided, the conditioned negative reinforcer undergoes extinction. Incompatible behavior is then less and less strongly reinforced, and the punished behavior eventually emerges. When punishment again occurs, the aversive stimuli are reconditioned, and the behavior of doing something else is then reinforced. If punishment is discontinued, the behavior may emerge in full strength.

When an individual is punished for not responding in a given way, conditioned aversive stimulation is generated when he is doing anything else. Only by behaving in a given way may he become free of “guilt.” Thus one may avoid the aversive stimulation generated by “not doing one’s duty” by simply doing one’s duty. No moral or ethical problem is necessarily involved: a draft horse is kept moving according to the same formula. When the horse slows down, the slower pace (or the crack of a whip) supplies a conditioned aversive stimulus from which the horse escapes by increasing its speed. The aversive effect must be reinstated from time to time by actual contact with the whip.

Since punishment depends in large part upon the behavior of other people, it is likely to be intermittent. The action which is always punished is rare. All the schedules of reinforcement described in Chapter VI are presumably available.

SOME UNFORTUNATE BY-PRODUCTS OF PUNISHMENT

Severe punishment unquestionably has an immediate effect in reducing a tendency to act in a given way. This result is no doubt responsible for its widespread use. We “instinctively” attack anyone whose behavior displeases us—perhaps not in physical assault, but with criticism, disapproval, blame, or ridicule. Whether or not there is an inherited tendency to do this, the immediate effect of the practice is reinforcing enough to explain its currency. In the long run, however, punishment does not actually eliminate behavior from a repertoire, and its temporary achievement is obtained at tremendous cost in reducing the over-all efficiency and happiness of the group.

One by-product is a sort of conflict between the response which leads to punishment and the response which avoids it. These responses are incompatible and they are both likely to be strong at the same time. The repressing behavior generated by even severe and sustained punishment often has very little advantage over the behavior it represses. The result of such a conflict is discussed in Chapter XIV. When punishment is only intermittently administered, the conflict is especially troublesome, as we see in the case of the child who “does not know when he will be punished and when he will get away with it.” Responses which avoid punishment may alternate with punished responses in rapid oscillation or both may blend into an uncoordinated form. In the awkward, timorous, or “inhibited” person, standard behavior is interrupted by distracting responses, such as turning, stopping, and doing something else. The stutterer or stammerer shows a similar effect on a finer scale.

Another by-product of the use of punishment is even more unfortunate. Punished behavior is often strong, and certain incipient stages are therefore frequently reached. Even though the stimulation thus generated is successful in preventing a full-scale occurrence, it also evokes reflexes characteristic of fear, anxiety, and other emotions. Moreover, the incompatible behavior which blocks the punished response may resemble external physical restraint in generating rage or frustration. Since the variables responsible for these emotional patterns are generated by the organism itself, no appropriate escape behavior is available. The condition may be chronic and may result in “psychosomatic” illness or otherwise interfere with the effective behavior of the individual in his daily life (Chapter XXIV).

Perhaps the most troublesome result is obtained when the behavior punished is reflex—for example, weeping. Here it is usually not possible to execute “just the opposite,” since such behavior is not conditioned according to the operant formula. The repressing behavior must therefore work through a second stage, as in the operant control of “involuntary behavior” discussed in Chapter VI. Some examples will be considered in Chapter XXIV where the techniques of psychotherapy will be shown to be mainly concerned with the unfortunate by-products of punishment.

ALTERNATIVES TO PUNISHMENT

We may avoid the use of punishment by weakening an operant in other ways. Behavior which is conspicuously due to emotional circumstances, for example, is often likely to be punished, but it may often be more effectively controlled by modifying the circumstances. Changes brought about by satiation, too, often have the effect which is contemplated in the use of punishment. Behavior may often be eliminated from a repertoire, especially in young children, simply by allowing time to pass in accordance with a developmental schedule. If the behavior is largely a function of age, the child will, as we say, outgrow it. It is not always easy to put up with the behavior until this happens, especially under the conditions of the average household, but there is some consolation if we know that by carrying the child through a socially unacceptable stage we spare him the later complications arising from punishment.

Another way of weakening a conditioned response is simply to let time pass. This process of forgetting is not to be confused with extinction. Unfortunately it is generally slow and also requires that occasions for the behavior be avoided.

The most effective alternative process is probably extinction. This takes time but is much more rapid than allowing the response to be forgotten. The technique seems to be relatively free of objectionable by-products. We recommend it, for example, when we suggest that a parent “pay no attention” to objectionable behavior on the part of his child. If the child’s behavior is strong only because it has been reinforced by “getting a rise out of” the parent, it will disappear when this consequence is no longer forthcoming.

Another technique is to condition incompatible behavior, not by withdrawing censure or guilt, but through positive reinforcement. We use this method when we control a tendency toward emotional display by reinforcing stoical behavior. This is very different from punishing emotional behavior, even though the latter also provides for the indirect reinforcement of stoical behavior through a reduction in aversive stimuli. Direct positive reinforcement is to be preferred because it appears to have fewer objectionable by-products.

Civilized man has made some progress in turning from punishment to alternative forms of control. Avenging gods and hell-fire have given way to an emphasis upon heaven and the positive consequences of the good life. In agriculture and industry, fair wages are recognized as an improvement over slavery. The birch rod has made way for the reinforcements naturally accorded the educated man. Even in politics and government the power to punish has been supplemented by a more positive support of the behavior which conforms to the interests of the governing agency. But we are still a long way from exploiting the alternatives, and we are not likely to make any real advance so long as our information about punishment and the alternatives to punishment remains at the level of casual observation. As a consistent picture of the extremely complex consequences of punishment emerges from analytical research, we may gain the confidence and skill needed to design alternative procedures in the clinic, in education, in industry, in politics, and in other practical fields.