Chapter 8. The Logic and Design of the Survey Experiment: An Autobiography of a Methodological Innovation

Paul M. Sniderman
The title promises a chapter about methods, so a confession is in order. Here, as everywhere, my concerns are substantive, not methodological. Still, what one wants to learn and how one ought to go about learning it are intertwined. So, I propose to bring out the logic of the survey experiment by presenting a classification of survey experiment designs. Specifically, I distinguish three designs: manipulative, permissive, and facilitative. The distinctions among the designs turn on the hypotheses being tested, not the operations performed, and, above all, on the role of predispositions. The first design aims to get people to do what they are not predisposed to do; the second to allow them to do what they are predisposed to do, without encouraging them; and the third to provide them with a relevant reason to do what they already are predisposed to do. Against the background of this threefold classification, I want to comment briefly on some issues of causal inference and external validity and then conclude by offering my own view on the reasons for the explosive growth in survey experiments in the study of public opinion.
The modern survey experiment is the biggest change in survey research in a half century. There is some interest in how it came about, I am told. So I begin by telling how I got the idea of computer-assisted survey experiments. I excuse this personal note partly because the editors requested it but, more importantly, because it allows me to acknowledge publicly the contributions of others.

1. Logic of Discovery

A year later, we took our children to spend a year in Toronto, living at my in-laws’ house, so that they would know their grandparents, and their grandparents would know them – not as children parachuted in from California for a brief stay, with their grandmother placing vats of candy by their bedsides, but as a family living together. It seemed like a good idea, and once again I learned the danger of good ideas. Our children were heartbroken at returning to California. There was an upside, however. Living with one's in-laws, however welcoming they are, is an out-of-equilibrium experience. I mention this only because it says something about the social psychology of discovery. I do not believe that I would have had the breakthrough idea about computer-assisted survey experiments were it not for the sharp and long break with everyday routine. Among other things, it allowed the past to catch up with the present.
As a child, I went to a progressive summer camp. After a day of games on land and water, we would be treated to a late-afternoon lecture in the rec hall on issues of social importance. One of the lessons that we were taught was that discrimination and prejudice are quite different things. Prejudice is how others feel about us (i.e., Jews), whereas discrimination is how others treat us. Although prejudice is a bad thing, how others feel about us is not nearly as important as how they treat us. Covenants against Jews buying property in “protected” areas, bans on membership in clubs, and quotas on university admission were the norm then.3 But between then and now, the memory of the lecture on the difference between prejudice and discrimination would regularly recur, and I would just as routinely be struck by the frustrating irony that I had enlisted in a vocation, survey research, that could study prejudice (attitudes) but not discrimination (action). That persisting frustration, I believe, was behind the idea that struck me on my walk with such force.
Voilà! There was the broad answer to the summer camp lecture on the distinction between prejudice and discrimination. Ask a randomly selected set of respondents how much help the government should give to a white American who has lost her job in finding another. Ask the others exactly the same question, except that this time it is a black American who has been laid off. If more white Americans back a claim to government assistance when the beneficiary is white, then we are capturing not only how they feel about black Americans but also how they treat them.
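To make the logic concrete, here is a minimal sketch of that randomized question-wording design, assuming hypothetical data; the variable names and numbers are illustrative placeholders, not the actual study.

```python
# A minimal sketch of the randomized question-wording idea, with placeholder data.
# Each respondent is randomly assigned one version of the item; comparing mean
# support across versions captures treatment (behavior), not just attitudes.
import random

def assign_version() -> str:
    """Randomly assign the race of the laid-off worker named in the question."""
    return random.choice(["white", "black"])

# Hypothetical responses: support for government help on a 1-4 scale.
responses = {"white": [4, 3, 4, 2, 3], "black": [3, 2, 3, 2, 2]}  # placeholder data

for version, scores in responses.items():
    print(f"{version} version: mean support = {sum(scores) / len(scores):.2f}")
```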
That was the idea – and I remember the street that I was on and the house that I was looking at when I had it. And absolutely nothing would have come of it but for Tom Piazza. Although Tom and I had seen each other around the halls of the SRC for years, the main thing we knew about each other was that we shared an interest in the analysis of racial attitudes (see Apostle et al. 1983). Blanche DuBois relied on the kindness of strangers. I have relied on their creativity and character. Tom was the one who made computer-assisted randomized experiments work. Every study that I have done since, we have done together, regardless of whether his name appeared on the project.
Childhood memories, disruption of routines, social science as a collaborative enterprise, technology as door opening, research centers as institutionalized sources of ecological serendipity – those are the themes of the first part of my story on the logic of discovery. The theme of the second part of my story is a variation on Robert Merton's (1973) classic characterization of the communist – his word – character of science.5
Tom and I had a monopoly position. Rather than take advantage of Merrill's breakthrough, the Institute for Social Research (ISR) at the University of Michigan and the National Opinion Research Center (NORC) at The University of Chicago attempted – for years – to write their own computer-assisted program. It was a bad decision for them because they failed. An ideal outcome for us, you might think. Only studies done through the Berkeley SRC could exploit the flexibility of computer-assisted interviewing in the design of randomized experiments, which meant that we would have no competition in conducting survey experiments for years into the future.
Merton was right about the communist character of science, however. We would succeed, but we would do so alone. And if we succeeded alone, we would fail. If other researchers could not play in our sandbox, then they would find another sandbox to play in. Our work would always be at the margins.
I had had my chance to come to bat in designing a study, actually two studies.7 The first article that we succeeded in publishing using randomized experiments gave me the idea. The article was built on the analysis of two experiments. Yet each experiment was only a question, admittedly a question that came in many forms, but at the end of the day, only a question, which is to say that the experiment took only about thirty seconds to administer. It then came to me that an interview of standard length could be used as a platform for multiple investigators. Each would have time for two to four experiments; each would have access to a common pool of right-hand-side variables; each would be a principal investigator. If their experiments were a success, they would be a success. If not, they would have had a chance to swing at the ball.
This idea of a shared platform for independent studies was the second-best design idea on my score card. It made it possible for investigators, in the early stages of their careers, to do original survey research without having to raise the money.8 But how to identify who should have the chance? A large part of the motivation was that very few had had an opportunity to distinguish themselves through the design of original studies. My solution: I shamelessly solicited invitations to give talks at any university that would have me in order to identify a pool of possible participants. I then invited them to write a proposal, on the understanding that their idea was theirs alone, but that the responsibility for making the case to the National Science Foundation (NSF) was mine. My sales pitch was, “We will do thirteen studies for the price of one.” That was the birth of the Multi-Investigator Project. It is Karen Garret, the director of the project, who deserves the credit for making the studies a success.
I have one more personal note to add. The Multi-Investigator Project ran two waves. The day that I received the grant from NSF for the second wave, I made a decision. I should give up the project. Gatekeepers should be changed, I had always believed, and that applied to me, too. Diana Mutz and Arthur Lupia were the obvious choices. As the heads of Time-sharing Experiments in the Social Sciences (TESS), they transformed the Multi-Investigator Project. To get a sense of the order of magnitude of the difference between the two platforms, think of the Multi-Investigator Project as a stagecoach and TESS as a Mercedes-Benz truck. Add the support of the NSF, particularly through the Political Science Program, the creativity of researchers, and the radical lowering of the costs of entry through cooperative election studies, and survey experiments have become a standard tool in the study of public opinion and voting. There is not a medal big enough to award Lupia and Mutz that would do justice to their achievements.
What is good fortune? Seeing an idea of yours travel the full arc, from being viewed at the outset as ridiculous to becoming in the end commonplace.9 My sense of the idea has itself traveled an arc. Originally, I saw it as a tool to do one job. Gradually, I came to view it as a tool to do another.

2. A Design Classification

Manipulative Designs
Standardly, the distinction between observational and experimental designs parallels Hacking's (1983) distinction between representing and intervening. Interventions or manipulations are the natural way to think of the treatment condition in an experiment. How does one test a vaccine?11 By intervening on a random basis, administering a vaccine to some patients and a placebo to others, and noting the difference in outcomes between the two groups. Moreover, the equation of intervention and manipulation seemed all the more natural against the background understanding of public opinion a generation ago. Knowing and caring little about politics, the average citizen arranged her opinions higgledy-piggledy (the lack-of-constraint problem), even supposing that she had formed some in the first place (the nonattitudes problem), the reductio of this conception of public opinion being the claim that “most” people lacked attitudes on “most” issues, preferring instead “to make it up as they go along.”12 What, then, was the role of survey experiments? To demonstrate how easily one could get respondents to do what they were not predisposed to do.
The first generation of “framing” experiments is a poster-child example of a manipulative design (e.g., Zaller 1992; Nelson and Kinder 1996). In one condition, a policy was framed in a way to evoke a positive response; in the other, the same policy was framed to evoke a negative response. And, would you believe, the policy enjoyed more support in the positive framing condition and evoked more opposition in the negative one? The substantive conclusion that was drawn was that the public was a marionette, and its strings could be pulled for or against a policy by controlling the frame. But this is to tell a story about politics with the politics left out. The parties and candidates battle over how policies should be framed, just as they battle over the positions that citizens should take on them.13 So Theriault and I (Sniderman and Theriault 2004) carried out a pair of experiments that replicated the positive and negative conditions of the first generation of framing experiments, but we added a third condition in which both frames were presented and a fourth in which neither appeared. The first two conditions replicated the findings of the first generation of framing experiments. But the third led to a quite different conclusion. Confronted with both frames in the experiment (as they typically would be in real life, if not simultaneously, then in close succession), rather than being confused and thrown off the tracks, respondents are better able to pick the policy alternative closest to their general view of the matter. Druckman (e.g., Druckman 2001a, 2001b, 2001c, 2004; Chong and Druckman 2007a, 2007b; Druckman et al. 2010) widened this small opening into a seminal series of studies on framing. In the areas in which I have research expertise, I am hard-pressed to think of another who has, step by step, progressively deepened our understanding of a focal problem.
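As a rough illustration of the design, here is a minimal sketch of random assignment to the four framing conditions, under the assumption of placeholder frame texts rather than the study's actual wording.

```python
# A minimal sketch of the four-condition framing design: positive frame only,
# negative frame only, both frames, or neither. Frame texts are placeholders.
import random

FRAMES = {
    "positive": "Placeholder: an argument framing the policy favorably.",
    "negative": "Placeholder: an argument framing the policy unfavorably.",
}

def framing_condition() -> list[str]:
    """Randomly assign a respondent to one of the four framing conditions."""
    condition = random.choice(["positive", "negative", "both", "neither"])
    if condition == "both":
        return [FRAMES["positive"], FRAMES["negative"]]
    if condition == "neither":
        return []
    return [FRAMES[condition]]

print(framing_condition())  # the frame(s) read to this respondent in the assigned condition
```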
Permissive Designs
A showpiece example of a permissive design in survey experiments is the List Experiment.14 The measurement problem is this: Can one create a set of circumstances in which a person being interviewed can express a potentially objectionable sentiment without the interviewer being aware that she has expressed it?15 Kuklinski's creative insight: to devise a question format that leads respondents to infer, correctly, that the interviewer cannot tell which responses they have made, but the data analyst can determine ex post the proportion of respondents making a particular response (see Kuklinski et al. 1997). To give a hypersimplified description of the procedure, in the baseline condition, the interviewer begins by saying, “I am going to read you a list of some things that make some people angry. I want you to tell me how many make you angry. Don't tell me which items make you angry. Just how many.” The interviewer then reads a list of, say, four items. In the test condition, everything is exactly the same, except that the list now has one more item, say, affirmative action for blacks. To determine the proportion of respondents angry over affirmative action, it is only necessary to subtract the mean angry responses in the baseline condition from the mean angry responses in the test condition, and then multiply by 100. Characteristics of respondents that increase (or decrease) the hit rate can be identified iteratively.
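To make the arithmetic explicit, here is a minimal sketch of the List Experiment estimator with made-up counts; the numbers are placeholders, not the study's data.

```python
# A minimal sketch of the List Experiment estimator: the share angered by the
# sensitive item is the difference in mean counts between the two conditions.
import numpy as np

baseline = np.array([1, 2, 0, 3, 1, 2, 1, 0, 2, 1])  # counts over the 4-item list (placeholder)
test     = np.array([2, 3, 1, 3, 2, 2, 1, 1, 3, 2])  # counts over the 5-item list (placeholder)

estimated_share = test.mean() - baseline.mean()
print(f"Estimated share angry about the added item: {estimated_share * 100:.1f}%")
```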
This type of design I baptize permissive because it allows respondents to respond without encouraging, inducing, or exerting pressure on them to do so. So it is with the List Experiment. Why do some respondents respond with a higher number in the treatment condition than in the baseline condition? Because they are predisposed to do so. They are angry over affirmative action and are being given the opportunity to express their anger – believing (correctly) that the interviewer has no way of knowing that they have done so – without realizing that a data analyst could deduce the proportion expressing anger ex post.
A second example of a permissive design comes from a celebrated series of studies on risk aversion by Tversky and Kahneman. They demonstrated that people have strikingly different preferences over two logically equivalent choices, depending on whether the choice is framed in terms of gains or losses. Their Asian Disease Experiment is a paradigmatic example. People are far more likely to favor exactly the same course of action if the choice alternatives are posed in terms of lives saved rather than lives lost. This result, labeled “risk aversion,” is highly robust. With an ingenious design, Druckman (2001a) carried out an experiment that had two arms: one matched the Kahneman–Tversky design, whereas the other added credible advice, in the form of endorsements of a course of action by political parties. The key finding: partisans take their cue from party endorsements, so much so that the gain–loss framing effect virtually disappears. I want to make two points with this example. First, framing effects between logically equivalent choices, framed in terms of gains or losses, are robustly found in the absence of other information to exploit. Second, the observed effect is not a function of an experimental intervention in the form of an application of pressure on a respondent to react in a particular direction. It is instead a matter of allowing people to respond as they are predisposed without encouraging them to do so.
Facilitative Designs
The third type of design for survey experiments I christen facilitative. Permissive designs aim to allow respondents to do what they are predisposed to do without encouraging them. Manipulative designs aim to get people to do what they are not predisposed to do. Like permissive designs but unlike manipulative ones, facilitative designs do not involve the use of coercive or impelling force. Unlike permissive and manipulative designs, facilitative designs involve a directional force in the form of a relevant reason to do what people are already predisposed to do.
I have become persuaded that this notion of a relevant reason is a tip-off to a primary use of survey experiments for the study of public opinion. Let me illustrate what I mean by the notion of a relevant reason with an experiment designed by Laura Stoker (1998). The aim of this experiment is to determine the connection between support for a policy and the justification provided for it. Stoker picks affirmative action in its most provocative form – mandatory job quotas.
This in-your-face formulation of the policy should trigger the emotional logic that Converse (1964) argued underlies “reasoning” about racial policies in general. How one feels about blacks, he hypothesized, is the key to understanding why whites tend to line up on one or the other side of racial policies across the board. Feel negatively about blacks, and you will oppose policies to help them; feel positively, and you will support them. Stoker's (1998) experiment opens a new door on policy reasoning, though. It investigates the persuasive weight of two different reasons for mandatory quotas. Stoker's results show that one reason, the underrepresentation of blacks, counts as no reason at all – that is, there is no difference between deploying it as a justification and not deploying a justification at all. In contrast, the other reason, a finding of discrimination, counts as a relevant reason indeed – that is, it markedly increases support for affirmative action even framed in its most provocative form. Stoker's discovery is not the common-sense idea that policy justifications can make a difference. It is rather that justifications differ in whether they make a difference. There is a world of difference between declaiming that fairness matters and specifying what counts as fairness.
As a second example of facilitation, consider the counter-argument technique. The counter-argument technique was introduced in Sniderman and Piazza (1993) and explored further in Sniderman et al. (1996). Gibson made it a central technique in the survey researchers’ toolkit, deploying it in a remarkably ambitious series of survey settings.17 The first generation of counter-argument studies constituted only a quasi-experiment, however: the counter-argument presented to reconsider support for a policy is (naturally enough) different from the one presented to reconsider opposition to it. Hence the relevance of the second generation of counter-argument experiments (Jackman and Sniderman 2006). Respondents take a position on an issue and are then presented with a reason to reconsider. What should count as a reason to reconsider, one may reasonably ask, and what more exactly are people doing when they are reconsidering their initial position? Two content-laden counter-arguments are administered. One presents a substantive reason for respondents who have supported more government help to renounce this position, whereas the other provides a substantive reason for respondents who have opposed it to renounce their position. In addition, a content-free counter-argument – that is, an objection to the position that respondents have taken that has the form of an argument but not the specific substantive content of one18 – is also administered. Thus, half of the respondents initially supporting the policy get a content-laden counter-argument; half get a content-free one. Ditto for respondents initially opposing the policy.
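A minimal sketch of the branching logic just described, as a computer-assisted interview might implement it, follows; the function name and the content-laden argument texts are illustrative placeholders, though the content-free wording is quoted from note 18.

```python
# A minimal sketch of the counter-argument branching: respondents are routed to a
# counter-argument conditional on their initial position, and half (at random)
# receive a content-laden argument, half the content-free objection.
import random

CONTENT_FREE = "However, if one thinks of all the problems this is going to create..."

def counter_argument(initial_position: str) -> str:
    """Return the counter-argument to read to a respondent (placeholder texts)."""
    content_laden = {
        "support": "Placeholder: a substantive reason to reconsider supporting more government help.",
        "oppose": "Placeholder: a substantive reason to reconsider opposing more government help.",
    }
    if random.random() < 0.5:  # half of each group, at random
        return content_laden[initial_position]
    return CONTENT_FREE

print(counter_argument("support"))  # example: a respondent who initially supported the policy
```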
There are two points I would make. The first is that respondents at all levels of political sophistication discriminate between a genuine reason (i.e., an argument that provides a substantive argument to reconsider) and a pseudo reason (i.e., an argument that merely points to the uncertainty of taking any position). Twice as many report changing their minds in response to a content-laden, rather than a content-free, counter-argument. There is, in short, a difference between getting an argument and getting argued with. The second point is that the bulk of those changing in the face of a content-laden counter-argument had taken a position at odds with their general view of the matter. What work, then, was the content-laden counter-argument doing? Most who change their initial position in response to a content-laden counter-argument had good reason to change. The side of the issue they had initially chosen was inconsistent with their general view of the matter.19 They were rethinking their initial position by dint of a reason that, from their point of view, should count as a reason to reconsider their position. In reconsidering, they were not changing their mind; rather, they were correcting a misstep. What, then, was the experimental intervention accomplishing? It was facilitating their reconsideration of the position they had taken in light of a consideration that counted as a relevant reason for reconsideration, given their own general view of the matter.

3. Experimental Treatments and Political Predispositions

In a pioneering analysis of the logic of survey experiments, Gaines, Kuklinski, and Quirk (2007) bring to the foreground a neglected consideration. Respondents do not enter public opinion interviews as blank slates. They bring with them the effects of previous experiences. Gaines et al. refer to the enduring effects of previous experience as pretreatment. In their view, understanding how pretreatment conditions experimental responses is a precondition of understanding the logic of survey experiments. This is a dead-on-target insight. In my view, it is, if anything, an understatement. The purpose of survey experiments in the study of public opinion is precisely to understand pretreatment – or, as I think of it, previous conditioning.
The Null Hypothesis
To bring out the logic of the problem, I enlist the SAT Experiment (Sniderman and Piazza 2002). African Americans have their own culture, it is claimed (Dawson 2001). Although there is a positive sense in which this claim may be true, there is also a negative sense in which it is false. The values of the American culture are as much the values of African Americans as of white Americans.21
To test this hypothesis of shared values, respondents, all of whom are black, are told of two young men, one black and the other white, applying to a college that can admit only one of them. The young white man's college entrance exam score is always 80; the young black man's score is randomly set at 55, 60, 65, 70, or 75.22 Respondents are asked which of the two young men should be admitted, given that the college can admit only one.
Our hypothesis was that African Americans share the core values of the common culture. So far as they do, they should choose the young white man because he always has the higher exam score. On the other hand, it surely is a reasonable expectation that African Americans will take into account the continuing burden of discrimination. The question, then, is how small the difference in scores between the two young men needs to be before it is regarded as negligible and African American respondents give the nod to the black candidate on other grounds – for example, the fact that he has had to overcome obstacles that the white candidate has not. We worked to establish feet-in-cement expectations, recruiting a sample of experts to pick the point at which a majority of African Americans would favor the black candidate. Seventy-five percent of our experts judged a difference of ten points to be so small that it would be waved away against the historic and continuing injustices done to blacks. And 100 percent of them predicted that a difference of only five points would be judged insignificant. In fact, even when the difference in scores was smallest, the overwhelming majority of African Americans picked the white candidate.23
The hypothesis is that African Americans share the core values of the American culture. If this is true, then they should overwhelmingly favor the candidate with the higher exam score, even if that always means favoring a white candidate over a black candidate. The null hypothesis, then, is that responses across the experimental conditions should not differ. In fact, whether the difference between the candidates’ SAT scores was large or small, respondents were equally likely to favor the higher scorer – even though the higher scorer in the experiment was always the white student. It is difficult for us to conceive of a more compelling demonstration of the commitment of African Americans to the value of achievement.24 Nor, at a lower rhetorical register, is it easy to imagine a better example of the absence of a treatment effect serving as evidence for a substantive hypothesis.
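One conventional way to check the null of no difference across the randomized score conditions is sketched below, using simulated placeholder data generated under the null; this illustrates the logic only and is not the authors' analysis.

```python
# A minimal sketch: compare the share favoring the higher scorer across the
# randomized score-gap conditions. Data here are simulated under the null.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({"black_score": rng.choice([55, 60, 65, 70, 75], size=n)})
# Placeholder outcome drawn with the same probability in every condition,
# i.e., generated under the null of no treatment effect.
df["chose_higher_scorer"] = rng.binomial(1, 0.8, size=n)

print(df.groupby("black_score")["chose_higher_scorer"].mean())

table = pd.crosstab(df["black_score"], df["chose_higher_scorer"])
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
```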
Interactions
The question that the Laid-Off Worker Experiment was designed to investigate was whether political conservatives discriminate against African Americans. Are they as willing to honor a claim for government assistance made by a black American as one made by a white American? But framing the question that broadly obscures the real question, we reasoned. Supposing that being black made a difference to conservatives, what is it about being black that makes the difference? Three stigmatizing characterizations of blacks stood out: “lazy” blacks, unmarried black mothers, and young black (stereotypically aggressive) males. Accordingly, in the Laid-Off Worker Experiment, respondents are told about a person who has lost her job and asked how much help the government should give her in finding another. Naturally, the race of the person who has been laid off was randomly varied. But so, too, were the gender, age, marital-parental status, and work history (dependable vs. undependable) of the laid-off worker.
When the data were analyzed, what should pop up but the finding that political conservatives are more, not less, likely to favor government assistance for a black worker who has lost her job than for a white worker. “Pop up” is not a scientific term, I recognize. But it would be a scam to imply that we anticipated that conservatives would go all out for out-of-work blacks. We had a reasonable expectation that conservatives would be harder on blacks than on whites. We never expected to find that they would respond with more sympathy and more support for a black who had lost her job than for a white who similarly found herself on the street. Nor had anyone else. The result would discredit the whole idea of using randomized experiments in public opinion research, I feared. Days of frenzied analysis followed. On the fourth day, Tom Piazza and I solved the puzzle. It was not blacks in general who evoked an especially supportive response from political conservatives: it was hard-working blacks distinctively. And why did conservatives respond to a hard-working black? Precisely because, for them, a hard-working black was the exception – and so they were willing to make an exception, favoring government help in finding her another job. So we argued in our initial study, and so we cross-validated in a follow-up.25
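As an illustration of how such an interaction might be probed, here is a minimal sketch with simulated placeholder data and illustrative variable names; it is not the authors' analysis code.

```python
# A minimal sketch of an interaction analysis for the Laid-Off Worker design:
# does the effect of the worker's (randomized) race depend on her (randomized)
# work history and on the respondent's ideology? Data are simulated placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 1000
df = pd.DataFrame({
    "support": rng.normal(size=n),               # placeholder support-for-help scale
    "black": rng.integers(0, 2, size=n),         # randomized race of the laid-off worker
    "dependable": rng.integers(0, 2, size=n),    # randomized work history
    "conservative": rng.integers(0, 2, size=n),  # respondent ideology (pre-treatment)
})

# The quantity of interest is the conservative x black x dependable interaction.
model = smf.ols("support ~ conservative * black * dependable", data=df).fit()
print(model.summary())
```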
From this experience, I draw two methodological lessons. We designed the experiment to test the hypothesis that conservatives racially discriminate (and would have had a blessed-on-all-sides career had the Laid-Off Worker Experiment done the job that we believed it would do). The result was nothing like we anticipated. And that is the first methodological lesson. Surprise is a cognitive emotion. And precisely because the design of experiments requires a definition of expectations, experiments can surprise in a way that observational analysis cannot. Hypotheses precede experiments rather than the other way around, which is the reason that each is designed the way it is. The second methodological point I would make is that the expression “split-half” should be banished. The presumption that survey experiments can have only two conditions has handcuffed survey experimenters. Complexity is not a value in and of itself. To say that an experiment has the right design is to say that it is set up in the right way to answer the question it is designed to answer. And computer-assisted surveys are a breakthrough, among other respects, because of the plasticity of the designs that they permit.
Survey Experiments and Counterfactual Conditionals: Majorities and Counter-Majorities under the Same Equilibrium Conditions
I argue that the principal business of survey experiments is to reveal what people are already predisposed to do. Ironically, this means that they can put us in a position to explore possible worlds, an example of which will make clear what I have in mind.
With Ted Carmines, I investigated a hypothesis about the potential for a breakthrough in public support for policies to assist blacks. Researchers of symbolic racism maintain that racial prejudice has a death grip on the American mind. In their view, for the grip of racism to weaken, nothing less than a change in the hearts and minds of white Americans was necessary. In contrast, we believed that there was a political opening. Revive the moral universalism of the civil rights movement, we reasoned, and a winning coalition of whites and blacks could be brought into existence.
To test this conjecture, we carried out a pair of experiments, the Regardless of Race Experiment and the Color Blind Experiment (Sniderman and Carmines 1997).26 Both experiments showed that support for policies that would help blacks is markedly higher if the arguments made on their behalf are morally universalistic, rather than racially particularistic. To be sure, conservatives are no more likely to support the policy when a universalistic appeal is made on its behalf than when a particularistic one is. But then again, why should they? They are being asked to support a liberal policy. Consistent with our hypothesis, moderates are markedly more likely to support the policy in the face of a universalistic, rather than a particularistic, appeal. Still more telling, so, too, are liberals.
This result illustrates a general point about politics and a specific one about racial politics. The general point is this: in politics, more than one winning coalition can exist under the same equilibrium conditions. There is the majority that one observes, conditional on the available political alternatives. But there are the counter-majorities that one would observe, conditional on different alternatives or different reasons for choosing between the same alternatives. This claim of multiple majorities under the same equilibrium conditions goes further than the standard interpretation of Riker's (1996) heresthetics. His claim is that bringing about a new winning coalition requires bringing a new dimension of cleavage to the fore. Thanks to experiments opening up the exploration of possible worlds, one can see how a new winning coalition can be brought about without bringing a new dimension of cleavage to the fore.
The second point has to do with the politics of race. Many race specialists in political science have nailed their flag to the claim that establishing a new majority on the issue of race – changing the politics of race – requires a change in the core values of Americans. In contrast, our claim was that it was not necessary to change the hearts and minds of white Americans in order to change the politics of race. A counter-majority ready to support a politics of race that was morally universalistic was in existence and already in position. It would be brought to the surface when a politician was ambitious and clever enough to mobilize it. It would go too far to say that our analysis predicted the Obama victory.27 It does not go too far to say that it is the only analysis of race and American politics that is consistent with it.

4. A Final View

The experimental method has made inroads on many fronts in political science, but why have survey experiments met with earlier and broader acceptance? Part of the answer to this question is straightforward. Survey experiments (and, when I say survey experiments, I include the whole family of interviewing modes, from face-to-face to telephone to web-based modes) have a lower hurdle to jump in meeting requirements of external validity. Lower does not mean low, I would hastily add. A second part of the answer for the explosive growth in survey experiments is similarly straightforward. Research areas flourish in inverse proportion to barriers to entry. With the introduction of the Multi-Investigator Project studies and then the enormous advance of TESS as a platform for survey experiments, the marginal cost of conducting survey experiments plummeted. Cooperative election studies, providing teams of investigators the time to carry out autonomously designed studies, have become the third stage of this cost revolution.
The importance of these two factors should not be underestimated, but a third factor is even more important, in my opinion. When it comes to survey experiments as a method for the study of politics, the “what” that is being studied has driven the “how” it is studied, rather than the other way around. It is the power of the ideas of generations of researchers in the study of public opinion and voting, incorporating theoretical frameworks from the social psychological to the rational, that has provided the propulsive force in the use of survey experiments in the study of mass politics.

References

Apostle, Richard A., Charles Y. Glock, Thomas Piazza, and Marijane Suelze. 1983. The Anatomy of Racial Attitudes. Berkeley: University of California Press.
Bartels, Larry, and Henry E. Brady. 1993. “The State of Quantitative Political Methodology.” In The State of the Discipline II, ed. Ada Finifter. Washington, DC: American Political Science Association, 121–62.
Chong, Dennis, and James N. Druckman. 2007a. “Framing Public Opinion in Competitive Democracies.” American Political Science Review 101: 637–55.
Chong, Dennis, and James N. Druckman. 2007b. “Framing Theory.” Annual Review of Political Science 10: 103–26.
Converse, Philip E. 1964. “The Nature of Belief Systems in Mass Publics.” In Ideology and Discontent, ed. David E. Apter. New York: Free Press, 206–61.
Dawson, Michael C. 2001. Black Visions. Chicago: The University of Chicago Press.
Druckman, James N. 2001a. “Using Credible Advice to Overcome Framing Effects.” Journal of Law, Economics, & Organization 17: 62–82.
Druckman, James N. 2001b. “On the Limits of Framing Effects.” Journal of Politics 63: 1041–66.
Druckman, James N. 2001c. “The Implications of Framing Effects for Citizen Competence.” Political Behavior 23: 225–56.
Druckman, James N. 2004. “Political Preference Formation.” American Political Science Review 98: 671–86.
Druckman, James N., Cari Lynn Hennessy, Kristi St. Charles, and Jonathan Weber. 2010. “Competing Rhetoric over Time: Frames versus Cues.” Journal of Politics 72: 136–48.
Freedman, David A., Roger Pisani, and Roger Purves. 1998. Statistics. New York: Norton, 3–11.
Gaines, Brian J., James H. Kuklinski, and Paul J. Quirk. 2007. “The Logic of the Survey Experiment Reexamined.” Political Analysis 15: 1–20.
Gibson, James L., and Amanda Gouws. 2003. Overcoming Intolerance in South Africa: Experiments in Democratic Persuasion. New York: Cambridge University Press.
Hacking, Ian. 1983. Representing and Intervening: Introductory Topics in the Philosophy of Natural Science. New York: Cambridge University Press.
Hurwitz, Jon, and Mark Peffley. 1998. Perception and Prejudice: Race and Politics in the United States. New Haven, CT: Yale University Press.
Jackman, Simon, and Paul M. Sniderman. 2006. “The Limits of Deliberative Discussion: A Model of Everyday Political Arguments.” Journal of Politics 68: 272–83.
Kuklinski, James H., Paul M. Sniderman, Kathleen Knight, Thomas Piazza, Philip E. Tetlock, Gordon R. Lawrence, and Barbara Mellers. 1997. “Racial Attitudes Toward Affirmative Action.” American Journal of Political Science 41: 402–19.
Merton, Robert K. 1973. The Sociology of Science. Chicago: The University of Chicago Press.
Nelson, Thomas E., and Donald R. Kinder. 1996. “Issue Frames and Group-Centrism in American Public Opinion.” Journal of Politics 58: 1055–78.
Riker, William H. 1996. The Strategy of Rhetoric. New Haven, CT: Yale University Press.
Schuman, Howard, and Stanley Presser. 1981. Questions and Answers in Attitude Surveys. New York: Academic Press.
Sniderman, Paul M., and Edward G. Carmines. 1997. Reaching beyond Race. Cambridge, MA: Harvard University Press.
Sniderman, Paul M., and Thomas Piazza. 1993. The Scar of Race. Cambridge, MA: Harvard University Press.
Sniderman, Paul M., and Thomas Piazza. 2002. Black Pride and Black Prejudice. Princeton, NJ: Princeton University Press.
Sniderman, Paul M., Philip E. Tetlock, and Laurel Elms. 2001. “Public Opinion and Democratic Politics: The Problem of Non-attitudes and Social Construction of Political Judgment.” In Citizens and Politics: Perspectives from Political Psychology, ed. James H. Kuklinski. New York: Cambridge University Press, 254–84.
Sniderman, Paul M., and Sean M. Theriault. 2004. “The Structure of Political Argument and the Logic of Issue Framing.” In Studies in Public Opinion, eds. Willem Saris and Paul M. Sniderman. Princeton, NJ: Princeton University Press, 133–65.
Stoker, Laura. 1998. “Understanding Whites’ Resistance to Affirmative Action: The Role of Principled Commitments and Racial Prejudice.” In Perception and Prejudice: Race and Politics in the United States, eds. Jon Hurwitz and Mark Peffley. New Haven, CT: Yale University Press, 135–70.
Webb, Eugene J., Donald T. Campbell, Richard D. Schwartz, and Lee Sechrest. 1966. Unobtrusive Measures: Nonreactive Research in the Social Sciences. Chicago: Rand McNally.
Zaller, John R. 1992. The Nature and Origins of Mass Opinion. New York: Cambridge University Press.
1 Schuman and Presser (1981) is the seminal work.
2 It was an exceptional achievement. ISR at Michigan and NORC in Chicago, the two heavyweight champions of academic survey research, gave years and a treasure chest of man-hours attempting to match the programming achievement of Merrill and his colleagues, only to fail.
3 My father and father-in-law were among the first Jews permitted to attend the University of Toronto Medical School. My wife was a member of the first class of the University of Toronto Medical School in which the Jewish quota was lifted.
4 The contrast is with the then-common practice of asking a series of items, varying the beneficiary of a policy (i.e., would you favor the program if it benefited a white American?, if it benefited a black American?). I am also presuming, when I speak of the procedure being invisible to the respondent, the artful writing of an item.
5 When referring to communism, Merton (1973) meant the principle of common ownership of scientific discoveries. We do not have a right to the means to make scientific discoveries, but we do have a right to share in them. And those who make them have a corresponding duty to allow us to share in them.
6 For an overview of how much progress was made on how many fronts, see Bartels and Brady (1993).
7 The first was the Bay Area Survey with Thomas Piazza, which led to Sniderman and Piazza (1993). The second was The Charter of Rights Study, which led to Sniderman et al. (1996). The third was the National Race and Politics (RAP) Study, which led to, among many other publications, Sniderman and Carmines (1997) and Hurwitz and Peffley (1998). The RAP was the trial run for the Multi-Investigator Project, and involved eight coprincipal investigators.
8 One of the benefits I did not anticipate was that, even if their first try had not succeeded, they had a leg up in writing a proposal for a full-scale study.
9 My first proposal to the NSF to do survey experiments was judged by two of the reviewers to be a farcical undertaking, one of whom took eight pages to make sure that his opinion of the project was clear.
10 The classification hinges on the aims of experiments. Because I know the hypotheses that experiments I have designed were designed to test, I (over)illustrate the principles, using examples of experiments that my colleagues and I have conducted.
11 See Freedman, Pisani, and Purves (1998), who offer the Salk vaccine trial as a paradigm example of a randomized experiment.
12 For a detailed critique of this view of public opinion, see Sniderman, Tetlock, and Elms (2001).
13 The idea of dual frames – or, to use Chong and Druckman's (2007a) term, competitive frames – came to me while watching a Democratic campaign ad on television framing an issue to its advantage, immediately followed by a Republican ad framing the same issue to its advantage.
14 Here is one of the few times when I know for certain where an idea came from. I myself was a witness at the creation of the List Experiment. During a planning session for the 1990 Race and Politics Study at the Circle 7 Ranch, I took Jim Kuklinski for a Jeep ride in the meadow. Suddenly, by the front gate, he stood up, exclaimed the equivalent of “Eureka,” and outlined the design of the List Experiment. I mention this for two reasons: 1) to put on record that Kuklinski devised the List Experiment, easily the most widely used survey experiment design, and 2) to offer an historical example of the creativity of multi-investigator studies – the National Race and Politics Study had nine coprincipal investigators and contributed more innovations than any previous study.
15 There is another possibility, and a more likely one in my view. They do not want to say openly that affirmative action makes them angry because doing so conflicts with their sense of self and their political principles; that is, it violates a principle or image of themselves that they value (see Sniderman and Carmines 1997).
16 Again, by way of underlining the decisive difference between the straitjacket of the split-ballot design and the plasticity of computer-assisted interviewing, I would note that the actual design of List Experiments tends to involve a number of test conditions, allowing for the comparison and contrast of, say, responses to African Americans becoming neighbors and responses to African Americans asking for affirmative action.
17 For an especially fascinating example, see Gibson and Gouws (2003).
18 The wording of the content-free counter-argument in this study is, “However, if one thinks of all the problems this is going to create…”
19 Treatment and control groups were thus identically positioned. Analysis searching for asymmetric effects conditional on being pro or con the policy failed to detect any.
20 This is a costly view. Among other things, it produces a publication bias: experiments are regarded as succeeding when they produce differences and as failing when they do not.
21 This is an example of a descriptive as opposed to a causal hypothesis, although it should not be assigned second-class status on this account. The former is capable of being as enlightening as the latter, and better grounded by far.
22 Their social class (in the form of their father's occupation) is also randomly varied.
23 As a test of social desirability, we examined separately respondents interviewed by black interviewers, and they were even more likely to hew to the value of achievement than those interviewed by white interviewers.
24 I am curious how many want to bet that white Americans would show a similar measure of commitment to the value of achievement in an equivalent situation.
25 See the Helping Hand Experiment (Sniderman and Carmines 1997).
26 Designing experiments in pairs provides invaluable opportunities for replication in the same study.
27 We had in mind an ambitious and gifted politician such as President Clinton, but, alas, Monica Lewinsky prevented a test of our hypothesis. It never entered our heads that the country had so progressed that an African American could do so.