Louisa had been overheard to begin a conversation with her brother one day, by saying, “Tom, I wonder,” —upon which Mr. Gradgrind, who was the person overhearing, stepped into the light and said, “Louisa, never wonder.”
Herein lay the spring of the mechanical art and mystery of educating the reason, without stooping to the cultivation of the sentiments and affections. Never wonder. By means of addition, subtraction, multiplication, and division, settle everything somehow, and never wonder.
Charles Dickens, Hard Times
The strength of private enterprise lies in its terrifying simplicity … it fits perfectly into the modern trend towards total quantification at the expense of the appreciation of qualitative differences; for private enterprise is not concerned with what it produces but with what it gains from production.
E. F. Schumacher, Small Is Beautiful
Mia Kang stared at the test sheet on her desk.
It was only practice. Teachers call it a “field test” to give them an idea of how students will perform on the Texas Assessment of Knowledge and Skills.
But instead of filling in the bubbles and making her teacher happy, Mia, a freshman at MacArthur High School, used her answer sheet to write an essay that challenged standardized testing and using test scores to judge children and rank schools.
“I wrote about how standardized tests are hurting and not helping schools and kids,” said Mia, who looks and acts older than her fourteen years. “I just couldn’t participate in something that I’m completely opposed to.”
“These tests don’t measure what kids really need to know, they measure what’s easy to measure,” she said. “We should be learning concepts and skills, not just memorizing. It’s sad for kids and it’s sad for teachers, too.”
When the teaching and testing implications of the No Child Left Behind Act of 2001 finally reached the classroom, there was a flurry of student resistance, of which Mia Kang’s brave stand was only a small example. Fifty-eight students at Danvers High School in Massachusetts signed a petition against being required to take the Massachusetts Comprehensive Assessment System (MCAS) exam, and those who refused to sit for the test were suspended from school. Students at other high schools in the state joined them. What might be called “elements of refusal” popped up throughout the country: large numbers of Michigan students opted out of the Michigan Educational Assessment Test, and Wisconsin’s high school “exit exam” (a condition of graduation) was scrapped owing to massive resistance from parents and students. In one case, teachers who resented the test drills now required of them protested by collectively refusing their own bonuses for superior performance. Protests against the tests required of early elementary pupils were organized on the pupils’ behalf by parents. While understanding the need to guarantee that children became literate and numerate early in their schooling, the parents objected to the “drill and kill” atmosphere in the classroom, as did their children.
A great deal, though not all, of the resistance was provoked by students who hated the “teaching to the test” drills that raised the never negligible quotient of boredom in the classroom to new levels. The test preparation was not merely alienated labor for students and teachers alike; it crowded out much of the time available for anything else—the arts, drama, history, sports, foreign languages, creative writing, poetry, field trips. Gone were many of the other goals that might animate education: cooperative learning, a multicultural curriculum, the fostering of multiple intelligences, discovery-oriented science, and problem-based learning.
The school was in danger of being transformed into a “one-product” factory, the product being students who could pass standardized tests designed to measure a narrow bandwidth of knowledge and test-taking skills. Here it is worth recalling once again that the modern institution of the school was invented at about the same time as the early textile factory. Each concentrated the workforce under one roof; each created time discipline and task specialization so as to facilitate supervision and evaluation; each aimed at producing a reliable, standardized product. The contemporary emphasis on regional or national standardized tests is based on the model of corporate management by quantitative norms, norms that allow comparisons across teachers, across schools, and across students so as to differentially reward them on the basis of their performance according to this criterion.
The question of the validity of the tests—whether they measure what they purport to measure—is in great doubt. That students can be trained to perform better by drills and by cramming makes it unclear what underlying knowledge or skills the tests measure. They have been shown to consistently underpredict the subsequent performance of women, of African Americans, and of pupils whose first language is not English. Above all, the alienation that high-stakes, test-driven education encourages threatens to give millions of youngsters a lifelong vaccination against school learning altogether.
Those seemingly most in favor of standardized tests as a management tool and a comparative measure of productivity are those at the greatest distance from ground zero of the classroom: superintendents of schools, city and state education officials, governors, and Department of Education policy makers. The tests give them all an index, however invalid, of comparative productivity and a powerful incentive system to impose their pedagogical plans. It is most curious that the United States should elect to homogenize its educational system when most of the rest of the world is headed in the opposite direction. Finland, for example, has no external tests and no ranking of students or schools, but scores exceptionally well on all international measures of achievement. Many high-quality colleges and universities have stopped requiring or even encouraging students to take the nationally administered Scholastic Assessment Test (previously the Scholastic Aptitude Test). Nations that have historically relied on a single national examination to allocate precious places in universities have been rushing headlong to eliminate or deemphasize the tests in order to foster “creativity,” often in what they take to be an imitation of the American system!
Knowing that their fates and that of their schools depended on test scores each year, many educators not only drilled their students mercilessly but also cheated to ensure a successful outcome. There was a nationwide epidemic of falsified results. One of the most recent exposures was in Atlanta, Georgia, where forty-four of fifty-six schools investigated were found to have systematically forged student answers by erasing wrong answers and substituting the correct ones.1 The Superintendent of Schools, named National Superintendent of the Year in 2009 for her exceptional achievement in raising scores, was found to have created a climate of fear by giving teachers three years to meet targets or be fired. More than 180 educators were implicated in fixing the scores. Like the “brightest people in the room” at Enron, who always found a way to beat the quarterly targets and collect their bonuses, the educators in Atlanta found a way to meet their targets as well, but not in the way anticipated. The stakes were lower, but the collateral damage was equally devastating, and the logic of “gaming the system” was basically the same.
Would you please join me in a brief fantasy? The year is 2020. Richard Levin, president of Yale University, has just retired after a long and brilliant tenure and has declared “2020 The Year of Perfect Vision.” Every last building is rebuilt and shining, the students are even more precocious, accomplished, and unionized than they were in 2010, US News & World Report and Consumer Reports (now merged) have ranked Yale University number 1 across the board—up there with the very best hotels, luxury automobiles, and lawnmowers. Well, nearly across the board. It seems that the quality of the faculty, as reflected in the all-important rankings, has slipped. Yale’s competitors are shaking their heads at the decline. Those who know how to read between the lines of apparently serene “Yale Corporation” pronouncements can detect a rising but of course still decorous panic.
One sign of concern can be read from the selection of President Levin’s successor, Condoleezza Rice, the retired secretary of state, who most recently led a no-nonsense, business-like streamlining of the Ford Foundation. Yes, she is the first woman of color to lead Yale. Of course, four other Ivy League schools have already been headed by women of color. This is not surprising, inasmuch as Yale has always followed the New England farmer’s rule: “Never be the first person to try something new, nor the last.”
On the other hand, President Rice wasn’t chosen for the symbolism; she was chosen for the promise she represented: the promise of leading a thoroughgoing restructuring of the faculty using the most advanced quality management techniques, techniques perfected from their crude beginnings at the Grandes Écoles of Paris in the late nineteenth century; embodied in Robert McNamara’s revolution at Ford and later in his work at the Department of Defense in the 1960s, as well as in Margaret Thatcher’s managerial revolution in British social policy and higher education in the 1980s; refined by the development of numerical measures of productivity by individuals and units in industrial management; further developed by the World Bank; and brought to near perfection, so far as higher education is concerned, by the Big Ten universities and making their way, belatedly, to the Ivy League.
We know from confidential sources among the members of the Yale Corporation how Dr. Rice captivated them in her job interview. She said she admired the judicious mix of feudalism (in its politics) and capitalism (in its financial management) that Yale had managed to preserve. It suited perfectly the reforms she had devised—as did Yale’s long tradition of what has come to be celebrated as “participatory autocracy” in faculty governance.
But it was her comprehensive plan for massively improving the quality of the faculty—or, more accurately, improving its standing in the national rankings—that convinced the corporation that she was the answer to their prayers.
She excoriated Yale’s antiquated practices of hiring, promoting, and tenuring faculty. They were, she said, subjective, medieval, unsystematic, capricious, and arbitrary. These customs, jealously guarded by the aging—largely white male—mandarins of the faculty, whose average age now hovered around eighty, were, she claimed, responsible for Yale’s loss of ground to the competition. They produced, on the one hand, a driven, insecure junior faculty who had no way of knowing what the criteria of success and promotion were beyond the tastes and prejudices of the seniors in their department and, on the other hand, a self-satisfied, unproductive oligarchy of gerontocrats heedless of the long-run interests of the institution.
Her plan, our sources tell us, was beguilingly simple. She proposed using the scientific techniques of quality evaluation employed elsewhere in the academy but implementing them, for the first time, in a truly comprehensive and transparent fashion. The scheme hinged on the citation indices: the Arts and Humanities Citation Index, the Social Science Citation Index, and, the granddaddy of them all, the Science Citation Index. To be sure, these counts of how often one’s work was cited by others in the field were already consulted from time to time in promotion reviews, but as president, Rice proposed making this form of objective evaluation systematic and comprehensive. The citation indices, she stressed, like the machine counting of votes, play no favorites; they are incapable of conscious or unconscious bias; they represent the only impersonal metric for judgments of academic distinction. They would henceforth be the sole criterion for promotion and tenure. If she succeeded in breaking tenure, they would also serve as a basis for automatically dismissing tenured faculty whose sloth and dimness prevented them from achieving annual citation norms (ACN, for short).
In keeping with the neoliberal emphasis on transparency, full public disclosure, and objectivity, President Rice proposes a modern, high-tech, academic version of Robert Owen’s factory scheme at New Lanark. The entire faculty is to be outfitted with digitalized beanies. As soon as they are designed—in Yale’s distinctive blue-and-white color scheme—and can be manufactured under humane, nonsweatshop conditions, using no child labor, all faculty will be required to wear them on campus. The front of the beanie, across the forehead, will consist of a digital screen, rather like a taxi meter, on which will be displayed the total citation count of that scholar in real time. As the fully automated citation recording centers register new citations, these citations, conveyed by satellite, will be posted automatically to the digital readout on the beanie. Think of a miniature version of the constantly updated world population count once available in lights in Times Square. Let’s call it the Public Record of Digitally Underwritten Citation Totals, which produces the useful acronym PRODUCT. Rice conjures a vision of the thrill students will experience as they listen, rapt, to the lecture of a brilliant and renowned professor whose beanie, while she lectures, is constantly humming, the total citations piling up before their very eyes. Meanwhile, in a nearby classroom, students worry as they contemplate the blank readout on the beanie of the embarrassed professor before them. How will their transcript look when the cumulative citation total of all the professors from whom they have taken courses is compared with the cumulative total of their competitors for graduate or professional school? Have they studied with the best and brightest?
Students will no longer have to rely on the fallible hearsay evidence of their friends or the prejudices of a course critique. The numerical “quality grade” of their instructor will be there for all to see and to judge. Junior faculty no longer need fear the caprice of their senior colleagues. A single, indisputable standard of achievement will, like a batting average, provide a measure of quality and an unambiguous target for ambition. For President Rice, the system solves the perennial problem of how to reform departments that languish in the backwaters of their disciplines and become bastions of narrow patronage. This publicly accountable, transparent, impersonal measure of professional standing shall henceforth be used in place of promotion and hiring committees.
Think of the clarity! A blue-ribbon panel of distinguished faculty (chosen by the new criterion) will simply establish several citation plateaus: one for renewal, one for promotion to term associate, one for tenure, and one for post-tenure performance. After that, the process will be entirely automated once the beanie technology is perfected. Imagine a much-quoted, pace-setting political science professor, Harvey Writealot, lecturing to a packed hall on campus. Suddenly, because an obscure scholar in Arizona has just quoted his last article in the Journal of Recent Recondite Research and, by chance, that very citation is the one that puts him over the top, the beanie instantly responds by flashing the good news in blue and white and playing “Boola-Boola.” The students, realizing what has happened, rise to applaud their professor’s elevation. He bows modestly, pleased and embarrassed by the fuss, and continues the lecture—but now with tenure. The console on the desk of President Rice’s office in Woodbridge Hall tells her that Harvey has “made it” into the magic circle on his own merits, and she in turn sends him a message of congratulations broadcast through the beanie by text and voice. A new, distinctive “tenure beanie” and certificate will follow shortly.
Members of the corporation, understanding instantly how much time and disputation this automated system could save and how it could catapult Yale back into the faculty ratings chase, set about refining and perfecting the technique. One suggests having a time-lapse system of citation depreciation, each year’s citations losing one-eighth of their value with each passing year. An eight-year-old citation would evaporate, in keeping with the pace of field development. Reluctantly, one member of the corporation suggests that, for consistency, there be a minimal plateau for retention, even of previously tenured faculty. She acknowledges that the image of a bent professor’s citation total degrading to the dismissal level in the middle of a seminar is a sad spectacle to contemplate. Another suggests that the beanie in such a case could simply be programmed to go completely blank, though one imagines the professor could read his fate in the averted gaze of his students.
My poking fun at quantitative measures of productivity in the academy, however satisfying in its own right, is meant to serve a larger purpose. The point I wish to make is that democracies, particularly mass democracies like the United States that have embraced meritocratic criteria for elite selection and the distribution of public funds, are tempted to develop impersonal, objective, mechanical measures of quality. Regardless of the form they take (the Social Science Citation Index, the Scholastic Aptitude Test (renamed the Scholastic Assessment Test and, more recently, the Scholastic Reasoning Test), cost-benefit analysis), they all follow the same logic. Why? The short answer is that there are few social decisions as momentous for individuals and families as the distribution of life chances through education and employment or as momentous for communities and regions as the distribution of public funds for public works projects. The seductiveness of such measures is that they all turn measures of quality into measures of quantity, thereby allowing comparison across cases with an apparently single and impersonal metric. They are above all a vast and deceptive “antipolitics machine” designed to turn legitimate political questions into neutral, objective administrative exercises governed by experts. It is this depoliticizing sleight-of-hand that masks a deep lack of faith in the possibilities of mutuality and learning in politics so treasured by anarchists and democrats alike. Before arriving at “politics,” however, there are two other potentially fatal objections to such techniques of quantitative commensuration.
The first and most obvious problem with such measures is that they are often invalid; that is, they rarely measure the quality we believe to be at stake with any accuracy.
The Science Citation Index (SCI), founded in 1963 and the granddaddy of all citation indices, was the brainchild of Eugene Garfield. Its purpose was to gauge, to measure the scientific impact of, say, a particular research paper, and by extension a particular scholar or research laboratory, by the frequency with which a published paper was cited by other research scientists. Why not? It sure beat relying on informal reputations, grants, the obscure embedded hierarchies of established institutions, let alone the sheer productivity of a scholar. More than half of all scientific publications, after all, seem to sink without a trace; they aren’t cited at all, not even once! Eighty percent are only cited once, ever. The SCI seemed to offer a neutral, accurate, transparent, disinterested, and objective measure of a scholar’s impact on subsequent scholarship. A blow for merit! And so it was, at least initially, compared to the structures of privilege and position it claimed to replace.
It was a great success, not least because it was heavily promoted; let’s not forget that this is a for-profit business! Soon it was pervasive: used in the award of tenure, to promote journals, to rank scholars and institutions, in technological analyses and government studies. Soon the Social Science Citation Index (SSCI) followed and, after that, could the Arts and Humanities Citation Index be far behind?
What precisely did the SCI measure? The first thing to notice is the computer-like mindlessness and abstraction of the data gathering. Self-citations count, adding auto-eroticism to the normal narcissism that prevails in the academy. Negative citations, “X’s article is the worst piece of research I have ever encountered,” also count. Score one for X! As Mae West said, “There’s no such thing as bad publicity; just spell my name right!” Citations found in books, as opposed to articles, are not canvassed. More seriously, what if absolutely NO ONE EVER READS the articles in which a work was cited, as is often the case? Then there is the provincialism of the exercise; this is, after all, a massively English-language, and hence Anglo-American, operation. Garfield claimed that the provincialism of French science could be seen in its failure to adopt English as the language of science. In the social sciences, this is preposterous on its face, but it is true that the translation and sale of your work to a hundred thousand Chinese, Brazilian, or Indonesian intellectuals will add nothing to your SSCI standing unless they record their gratitude in an English-language journal or in one of the handful of foreign language journals included in the magic circle.
Notice, too, that the index must, as a statistical matter, favor the specialties that are the most heavily trafficked, that is to say mainstream research or, in Kuhn’s terms, “normal science.” Notice finally that the “objectified subjectivity” of the SSCI is also supremely presentist. What if a current line of inquiry is dropped as a sterile exercise three years hence? Today’s wave, and the statistical blip it creates, may still have allowed our lucky researcher to surf to a safe harbor despite her mistake. There is no need to belabor these shortcomings of the SSCI further. They serve only to show the inevitable gap between measures of this kind and the underlying quality they purport to assess. The sorry fact is, many of these shortcomings could be rectified by reforms and elaborations of the procedures by which the index is constructed. In practice, however, the more schematically abstract and computationally simple measure is preferred for its ease of use and, in this case, lower cost. But beneath the apparently objective metric of citations lies a long series of “accounting conventions” smuggled into measurements that are deeply political and deeply consequential.
My fun at the expense of the SSCI may seem a cheap shot. The argument I’m making, however, applies to any quantitative standard rigidly applied. Take the apparently reasonable “two-book” standard often applied in some departments at Yale in tenure decisions. How many scholars are there whose single book or article has generated more intellectual energy than the collected works of other, quantitatively far more “productive,” scholars? The commensurating device known as the “tape measure” may tell us that a Vermeer interior and a cow plop are both twenty inches across; there, however, the similarity ends.
The second fatal flaw is that even if the measure, when first devised, was a valid measure, its very existence typically sets in motion a train of events that undermines its validity. Let’s call this a process by which “a measure colonizes behavior,” thereby negating whatever validity it once had. Thus, I have been told there are “rings” of scholars who have agreed to cite one another routinely and thereby raise their citation rating! Outright conspiracy of this kind is but the most egregious version of a more important phenomenon. Simply knowing that the citation index can make or break a career exerts a not-so-subtle influence on professional conduct: for example, the gravitational pull of mainstream methodologies and populous subfields, the choice of journals, the incantation of a field’s most notable figures are all encouraged by the incentives thereby conjured. This is not necessarily crass Machiavellian behavior; I’m pointing instead to the constant pressure at the margin to act “prudently.” The result, in the long run, is a selection pressure, in the Darwinian sense, favoring the survival of those who meet or exceed their audit quotas.
A citation index is not merely an observation; it is a force in the world, capable of generating its own observations. Social theorists have been so struck by this colonization that they have attempted to give it a lawlike formulation in Goodhart’s law, which holds that “when a measure becomes a target it ceases to be a good measure.”2 And Matthew Light clarifies: “An authority sets some quantitative standard to measure a particular achievement; those responsible for meeting that standard, do so, but not in the way which was intended.”
A historical example will clarify what I mean. The officials of the French absolutist kings sought to tax their subjects’ houses according to size. They seized on the brilliant device of counting the windows and doors of a dwelling. At the beginning of the exercise, the number of windows and doors was a nearly perfect proxy for the size of a house. Over the next two centuries, however, the “window and door tax,” as it was called, impelled people to reconstruct and rebuild houses so as to minimize the number of apertures and thereby reduce the tax. One imagines generations of French choking in their poorly ventilated “tax shelters.” What started out as a valid measure became an invalid measure.
But this kind of policy is not limited to windows or prerevolutionary France. Indeed, similar methods of audit and quality control have come to dominate the educational system throughout much of the world. In the United States, the SAT has come to represent the technique of quantification that serves to distribute higher educational opportunities in an apparently objective fashion. We could just as easily take up the “exam hell” that dominates the gateway to university education, and thereby life chances, in any number of other countries.
Let’s just say that with respect to education, the SAT is not just the tail that wags the dog. It has reshaped the dog’s breed, its appetite, its surroundings, and the lives of all those who care for it and feed it. It’s a striking example of colonization. A set of powerful quantitative observations, once again, creates something of a social Heisenberg Principle in which the scramble to make the grade utterly transforms the observational field. “Quantitative technologies work best,” Porter reminds us, “if the world they aim to describe can be remade in their own image.”3 It’s a fancy way of saying that the SAT has so reshaped education after its monochromatic image that what it observes is largely the effect of what it has itself conjured up.
Thus the desire to measure intellectual quality by standardized tests and to use those tests to distribute rewards to students, teachers, and schools has perverse colonizing effects. A veritable multi-million-dollar industry markets cram courses and techniques that purport to improve performance on tests that were said to be immune to such stratagems. Stanley Kaplan’s empire of test preparation courses and workbooks was built on the premise that one could learn to beat the test for college, law school, medical school, etc. The all-powerful audit criteria circle back, as it were, and colonize the lifeworld of education; the measurement replaces the quality it is supposed only to assess. There ensues something like an arms race in which the test formulators try to outwit the test preparation salesmen. The measurement ends by corrupting the desired substance or quality. Thus, once the “profile” of a successful applicant to an Ivy League school becomes known, the possibility of gaming the system arises. Education consultants are hired by wealthy parents to advise their children, with one eye on the Ivy League profile, about what extracurricular activities are desirable, what volunteer work might be advantageous, and so on. What began as a good faith exercise to make judgments of quality becomes, as parents try to “position” their children, a strategy. It becomes nearly impossible to assess the meaning or authenticity of such audit-corrupted behavior.
The desire for measures of performance that are quantitative, impersonal, and objective was, of course, integral to the management techniques brought from Ford Motor Company to the Pentagon by “whiz kid” Robert McNamara and applied to the war in Indochina. In a war without clearly demarcated battle lines, how could one gauge progress? McNamara told General Westmoreland, “General, show me a graph that will tell me whether we are winning or losing in Vietnam.” The result was at least two graphs: one, the most notorious, was an index of attrition, in which the “body counts” of confirmed enemy personnel killed in action were aggregated. Under enormous pressure to show progress, and knowing that the figures influenced promotions, decorations, and rest-and-recreation decisions, those who did the accounting made sure the body counts swelled. Any ambiguity between civilian and military casualties was elided; virtually all dead bodies became enemy military personnel. Soon, the total of enemy dead exceeded the known combined strength of the so-called Viet Cong and the North Vietnamese forces. Yet in the field, the enemy was anything but defeated.
The second index was an effort to take the measure of civilian sympathies in the campaign to Win Hearts and Minds—WHAM. The Hamlet Evaluation System was at its core: every one of South Vietnam’s 12,000 hamlets was classified according to an elaborate scheme as “pacified,” “contested,” or “hostile.” Pressure to show progress was again unrelenting. Ways were found: by fudging figures, by creating on paper self-defense militias that would have made Tsarina Catherine’s minister Grigory Potemkin proud, by statistically ignoring incidents of insurgent activity, in order to have the graph show improvement. Outright fraud, though not rare, was less common than the understandable tendency to resolve all ambiguities in the direction the incentives for a favorable evaluation and promotion led. Gradually, it seemed, the countryside was being pacified.
McNamara had created an infernal audit system that not only produced a mere simulacrum—a “command performance,” as it were—of legible progress but also blocked a wider-ranging dialogue about what might, under these circumstances, represent progress. They might have heeded a real scientist’s words, Einstein’s: “Not everything that counts can be counted and not everything that can be counted, counts.”
Finally, a more recent instance of this dynamic, with which many American investors have become sadly familiar, is furnished by the collapse of Enron Corporation. In the 1960s, business schools were preoccupied with the problem of how to “discipline” corporate managers so that they would not serve their own narrow interests at the expense of the interests of the company’s owners (aka shareholders). The solution they devised was to tie the compensation of senior management to business performance, as measured by shareholder value (aka share price). As their compensation in stock options depended, usually quarterly, on the share price, managers quickly responded by devising techniques in collaboration with their accountants and auditors to so cook the books that they would meet their quarterly share-price target and receive their bonuses. To boost the value of the company’s stock, they inflated profits and concealed losses so that others would be deceived into bidding up the share price. Thus, the attempt to make executive performance completely transparent by largely replacing salaries, given as a reward for labor and expertise, with stock option plans backfired. A similar “gaming logic” was at work in the bundling of mortgages into complex financial instruments implicated in the world financial collapse of 2008. Bond rating agencies, aside from being paid by bond issuers, had, in the interest of transparency, made their rating formulas available to investment firms. Knowing the procedures, or better yet hiring away the raters themselves, it became possible to reverse-engineer bonds with the formulas in mind and thereby achieve top ratings (AAA) for financial instruments that were exceptionally risky. Once again, the audit was successful but the patient died.
The great appeal of quantitative measures of quality arises, I believe, from two sources: a democratizing belief in equality of opportunity as opposed to inherited privilege, wealth, and entitlement, on the one hand, and a modernist conviction that merit can be scientifically measured on the other.
Applying scientific laws and quantitative measurement to most social problems would, modernists believed, eliminate sterile debates once the “facts” were known. This lens on the world has, built into it, a deeply embedded political agenda. There are, on this account, facts (usually numerical) that require no interpretation. Reliance on such facts should reduce the destructive play of narratives, sentiment, prejudices, habits, hyperbole, and emotion generally in public life. A cool, clinical, quantitative assessment would resolve disputes. Both the passions and the interests would be replaced by neutral, technical judgment. These scientific modernists aspired to minimize the distortions of subjectivity and partisan politics to achieve what Lorraine Daston has called “a-perspectival objectivity,” a view from nowhere.4 The political order most compatible with this view was the disinterested, impersonal rule of a technically educated elite using its scientific knowledge to regulate human affairs. This aspiration was seen as a new “civilizing project.” The reformist, cerebral Progressives in early twentieth-century America and, oddly enough, Lenin as well believed that objective scientific knowledge would allow the “administration of things” to largely replace politics. Their gospel of efficiency, technical training, and engineering solutions implied a world directed by a trained, rational, and professional managerial elite.
The idea of a meritocracy is the natural traveling companion of democracy and scientific modernism.5 No longer would a ruling class be an accident of noble birth, inherited wealth, or inherited status of any kind. Rulers would be selected, and hence legitimated, by virtue of their skills, intelligence, and demonstrated knowledge. (Here I pause to observe how other qualities one might plausibly want in positions of power, such as compassion, wisdom, courage, or breadth of experience, drop out of this account entirely.) Intelligence, by the standards of the time, was assumed by most of the educated public to be a measurable quality. Most assumed, furthermore, that intelligence was distributed, if not randomly, then at least far more widely than either wealth or title. The very idea of distributing, for the first time, position and life chances on the basis of measurable merit was a breath of democratic fresh air. It promised for society as a whole what Napoleon’s merit-based “careers open to talent” had promised the new professional middle class in France more than a century earlier.
Notions of a measurable meritocracy were democratic in still another sense: they severely curtailed the discretionary power previously claimed by the professional classes. Historically, the professions operated as trade guilds, setting their own standards, jealously guarding their professional secrets, and brooking no external scrutiny that would overrule their judgment. Lawyers, doctors, chartered accountants, engineers, and professors were hired for their professional judgment, a judgment that was often ineffable and opaque.
The mistakes made by a revolutionary workers movement are immeasurably more fruitful and more valuable than the infallibility of any party.
Rosa Luxemburg
The real damage of relying mainly on quantitatively measured merit and “objective” numerical audit systems to assess quality arises from taking vital questions that ought to be part of a vigorous democratic debate off the table and placing them in the hands of presumably neutral experts. It is this spurious depoliticization of momentous decisions affecting the life chances of millions of citizens and communities that deprives the public sphere of what legitimately belongs to it. If there is one conviction that anarchist thinkers and nondemagogic populists share, it is a faith in the capacity of a democratic citizenry to learn and grow through engagement in the public sphere. Just as we might ask what kind of person a particular office or factory routine produces, so might we want to ask how a political process might expand citizen knowledge and capacities. In this respect, the anarchist belief in mutuality without hierarchy and the capacity of ordinary citizens to learn through participation would deplore this short-circuiting of democratic debate. We can see the antipolitics machine at work in the uses of the Social Science Citation Index (SSCI), the Scholastic Aptitude Test (SAT), and in the now ubiquitous cost-benefit analysis.
The antipolitics of the SSCI consists in substituting a pseudo-scientific calculation for a healthy debate about quality. The real politics of a discipline—its worthy politics, anyway—is precisely the dialogue about standards of value and knowledge. I entertain few illusions about the typical quality of that dialogue. Are there interests and power relations at play? You bet. They’re ubiquitous. There is, however, no substitute for this necessarily qualitative and always-inconclusive discussion. It is the lifeblood of a discipline’s character, fought out in reviews, classrooms, roundtables, debates, and decisions about curriculum, hiring, and promotion. Any attempt to curtail that discussion by, for example, Balkanization into quasi-autonomous subfields, rigid quantitative standards, or elaborate scorecards tends simply to freeze a given orthodoxy or division of spoils in place.
The SAT system has, over the past half century, been opening and closing possible futures for millions of students. It has helped fashion an elite. Little wonder that that elite looks favorably on the system that helped it get to the front of the pack. It is just open enough, transparent enough, and impartial enough to allow elites and nonelites to regard it as a fair national competition for advancement. More than wealth or birth ever could, it allows the winners to see their reward as merited, although the correlations between SAT scores and socioeconomic status are enough to convince an impartial observer that this is no open door. The SAT, in effect, selected an elite that is more impartially chosen than its predecessors, more legitimate, and hence better situated to defend and reinforce the institution responsible for the naturalization of their excellence.
In the meantime, our political life is impoverished. The hold of the SAT convinces many middle-class whites that affirmative action is a stark choice between objective merit, on the one hand, and rank favoritism on the other. We are deprived of a public dialogue about how educational opportunity ought to be allocated in a democratic and plural society. We are deprived of a debate about what qualities we might want in our elites, individually in our schools, insofar as curricula simply echo the tunnel vision of the SAT.
An example drawn from a different field of public policy illustrates the way in which debatable assumptions are smuggled into the very structure of most audits and quantitative indices. Cost-benefit analysis, pioneered by the engineers of the French École des Ponts et Chaussées and now applied by development agencies, planning bodies, the U.S. Army Corps of Engineers, and the World Bank for virtually all their initiatives, is a striking case in point. Cost-benefit analysis is a series of valuation techniques designed to calculate the rate of return for any given project (a road, a bridge, a dam, a port). This requires that all costs and returns be monetized so that they can be subsumed under the same metric. Thus the cost of, say, the loss of a species of fish, the loss of a beautiful view, of jobs, of clean air, if it is to enter the calculus, must be expressed in dollar terms. This requires some heroic assumptions. For the loss of a beautiful view, “shadow pricing” is used whereby residents are asked how much they would be willing to add to their taxes to preserve the view. The sum then becomes its value! If fishermen sold the fish extinguished by a dam, the loss of sales would represent their value. If they were not sold, then they would be valueless for the purpose of the analysis. Osprey, otters, and mergansers might be disappointed at the loss of their livelihood, but only human losses count. Losses that cannot be monetized cannot enter the analysis. When, say, an Indian tribe refuses compensation and declares that the graves of their ancestors, shortly to be flooded behind a dam, are “priceless beyond measure,” it defies the logic of cost-benefit analysis and falls out of the equation.
Everything, all costs and benefits, must be made commensurate and monetized in order to enter the calculations for the rate of return: a sunset view, trout, air quality, jobs, recreation, water quality. Perhaps the most heroic of the assumptions behind cost-benefit analysis is the value of the future. The question arises, how is one to calculate future benefits—say, a gradually improving water quality or future job gains? In general, the rule is that future benefits will be discounted at the current or average rate of interest. As a practical matter this means that virtually any benefit, unless massive, more than five years in the future will be negligible once discounted in this fashion. Here, then, is a critical political decision about the value of the future that is smuggled into the cost-benefit formula as a mere accounting convention. Quite apart from the manipulations to which cost-benefit analysis has always been subject, the great damage it does, even when rigorously applied, is its radical depoliticization of public decision making.
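The discounting convention described above can be made concrete with a small sketch. It uses the standard present-value formula (benefit divided by one plus the rate, raised to the number of years); the benefit amount, the rates, and the time horizons are hypothetical figures chosen only for illustration, and the function name is likewise an assumption, not anything drawn from an actual agency's procedure.

```python
def present_value(benefit: float, rate: float, years: int) -> float:
    """Value today of a benefit received `years` from now, discounted at annual `rate`."""
    return benefit / (1.0 + rate) ** years


if __name__ == "__main__":
    # Hypothetical benefit of 1,000,000 arriving at different points in the future.
    benefit = 1_000_000.0
    for rate in (0.03, 0.07, 0.10):      # illustrative discount rates
        for years in (5, 20, 50):        # illustrative time horizons
            pv = present_value(benefit, rate, years)
            print(f"rate={rate:.0%}  years={years:>2}  present value={pv:,.0f}")
```

Running the loop shows how strongly the result depends on the rate and the horizon one picks, which is the sense in which the choice of a discount rate is a judgment about the value of the future rather than a neutral bookkeeping step.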
Porter attributes the adoption of audit systems of this kind in the United States to a “lack of trust in bureaucratic elites” and suggests that the United States “relies on rules to control the exercise of official judgment to a greater extent than any other industrialized democracy.”6 Thus, audit systems of this kind with the aim of achieving total objectivity by suppressing all discretion represent both the apotheosis of technocracy and its nemesis.
Each technique is an attempt to substitute a transparent, mechanical, explicit, and usually numerical procedure of evaluation for the suspect and apparently undemocratic practices of a professional elite. Each is a rich paradox from top to bottom, for the technique is also a response to political pressure: the desire of a clamorous public for procedures of decision and, in effect, rationing that are explicit, transparent, and, hence, in principle, accessible. Although cost-benefit analysis is a response to public political pressure—and here is one paradox—its success depends absolutely on appearing totally nonpolitical: objective, nonpartisan, and palpably scientific. Beneath this appearance, of course, cost-benefit analysis is deeply political. Its politics are buried deep in the techniques of calculation: in what to measure in the first place, in how to measure it, in what scale to use, in conventions of “discounting” and “commensuration,” in how observations are translated into numerical values, and in how these numerical values are used in decision making. While fending off charges of bias or favoritism, such techniques—and here is a second paradox—succeed brilliantly in entrenching a political agenda at the level of procedures and conventions of calculation that is doubly opaque and inaccessible.
When they are successful politically, the techniques of the SAT, cost-benefit analysis, and, for that matter, the Intelligence Quotient appear as solid, objective, and unquestionable as numbers for blood pressure, thermometer readings, cholesterol levels, and red blood cell counts. The readings are perfectly impersonal and, so far as their interpretation is concerned, “the doctor knows best.”
They seem to eliminate the capricious human element in decisions. Indeed, once the techniques with their deeply embedded and highly political assumptions are firmly in place, they do limit the discretion of officials. Charged with bias, the official can claim, with some truth, that “I am just cranking the handle”—of a nonpolitical decision-making machine. The vital protective cover such antipolitics machines provide helps explain why their validity is of less concern than their standardization, precision, and impartiality. Even if the SSCI does not measure the quality of a scholar’s work, even if the SAT doesn’t really measure intelligence or predict success in college, each constitutes an impartial, precise, public standard, a transparent set of rules and targets. When such tools succeed, they achieve the necessary alchemy of taking contentious and high-stakes battles for resources, life chances, mega-project benefits, and status and transmuting them into technical, apolitical decisions presided over by officials whose neutrality is beyond reproach. The criteria for decisions are explicit, standardized, and known in advance. Discretion and politics are made to disappear by techniques that are, at bottom, completely saturated with discretionary choices and political assumptions, now shielded effectively from public view.
The widespread use of numerical indices is not limited to any country, any branch of public policy, or indeed to the immediate present. Its current vogue in the form of the “audit society” obviously owes something to the rise of the large corporation, whose shareholders seek to measure productivity and results, and to the neoliberal politics of the 1970s and 1980s, as exemplified by Thatcher and Reagan. Their emphasis on “value for money” in public administration, borrowing techniques from management science in the private sector, sought to establish scores and “league tables” for schools, hospitals, police and fire departments, and so on. The deeper cause, however, is, paradoxically again, democratization and the demand for political control of administrative decisions. The United States seems to be something of an outlier in its embrace of audits and quantification. No other country has embraced audits in education, war-making, public works, and the compensation of business executives as enthusiastically as has the United States. Contrary to their self-image as a nation of rugged individualists, Americans are among the most normalized and monitored people in the world.
The great flaw of all these administrative techniques is that, in the name of equality and democracy, they function as a vast “antipolitics machine,” sweeping vast realms of legitimate public debate out of the public sphere and into the arms of technical, administrative committees. They stand in the way of potentially bracing and instructive debates about social policy, the meaning of intelligence, the selection of elites, the value of equity and diversity, and the purpose of economic growth and development. They are, in short, the means by which technical and administrative elites attempt to convince a skeptical public—while excluding that public from the debate—that they play no favorites, take no obscure discretionary action, and have no biases but are merely making transparent technical calculations. They are, today, the hallmark of a neoliberal political order in which the techniques of neoclassical economics have, in the name of scientific calculation and objectivity, come to replace other forms of reasoning.7 Whenever you hear someone say “I’m deeply invested in him/her” or refer to social or human “capital” or, so help me, refer to the “opportunity cost” of a human relationship, you’ll know what I’m talking about.