2
Bottomless Knowledge
BENNETT CERF BECAME an early television celebrity not because he was the publisher and co-founder of Random House but because he was dapper in his bow tie and had a seemingly infinite supply of anecdotes. Here’s one from a collection he published in 1943:
Cass Canfield of Harper’s was approached one day in his editorial sanctum by a sweet-faced but determined matron who wanted very much to discuss a first novel on which she was working. “How long should a novel be?” she demanded.
“That’s an impossible question to answer,” explained Canfield. “Some novels, like Ethan Frome, are only about 40,000 words long. Others, Gone with the Wind, for instance, may run to 300,000.”
“But what is the average length of the ordinary novel?” the lady persisted.
“Oh, I’d say about 80,000 words,” said Canfield.
The lady jumped to her feet with a cry of triumph. “Thank God!” she cried. “My book is finished!”[1]
Of course, she got it wrong. But her strategy was impeccable. She asked a genuine expert, got a correct answer, and reached a decision. And, most important, she could stop asking.
The system worked.
And it works not just for foolish women who in the 1940s served as the butt of our jokes. Faced with the fact that there is too much to know, our strategy has been to build a system of stopping points for knowledge. It’s an efficient response, well-suited to the paper medium by which we preserved and communicated knowledge.
Let’s walk it backward.
It’s 1983. You want to know the population of Pittsburgh, so instead of waiting six years for the Web to be invented, you head to the library. The card catalog leads you to an almanac, and the almanac’s index points you to the needle of fact in a thousand-page haystack. “Aha, Pittsburgh’s population is 2,219,000,” you say to yourself, writing it down so you’ll remember. The almanac publisher got the information from the US Census. The Census Bureau sent out hundreds of thousands of folks to knock on doors. They of course had to be trained, and before that a system had to be created for collecting and processing the information they sought. The most recent Census cost $6.9 billion just for 2010, not including the other nine years of operations,[2] or about $20 per American. The almanac cost the library $12.95. The facts in the almanac cost the library less than a penny each.
The economics of knowledge make sense only if, after looking up the population of Pittsburgh in the almanac, people stop looking. If everyone were to say “Well, that may be a pretty good guess, but I can’t trust it,” and then hire their own census takers to recount the citizens of Pittsburgh, the cost of knowledge would be astronomical. Distrust is an expensive vice.
We have been right to trust almanacs even without investigating how they ensure the reliability of their information. We’ve presumed that the almanac’s editors have collected its information carefully and have processes in place to make sure that it’s accurate. If someone were to challenge your assertion about population data, saying “I got it out of the latest almanac” would probably end the contest. The almanac’s presumed authority has stopped the argument. The system has worked again.
Of course, if someone you trusted said that you shouldn’t have used Bob’s Guesswork Almanac because it’s shoddily put together and full of typos, you might check a different one. And if human lives or large amounts of money depended on the absolute precision of the answer—that is, if the consequences of being wrong were high enough—you might track down the original data from the US Census, or even hire your own census takers. But short of this, you accept the almanac’s answer because the very fact that it was professionally published and stocked in your library serves as a credential attesting to its reliability. Such credentials put the “stop” into stopping points. Just as we as a species can’t afford to investigate every fact down to its origins, we can’t afford to investigate every credential. So, knowledge has been a system of stopping points justified by a series of stopping points. And for the most part, that works very well, especially since the system is constructed so that generally you can proceed to get more information if necessary: You can follow the footnotes, or check the population figures against a second source, without the cost and expense of commissioning your own phalanx of census-takers.
Our system of knowledge is a clever adaptation to the fact that our environment is too big to be known by any one person. A species that gets answers and can then stop asking is able to free itself for new inquiries. It will build pyramids and eventually large hadron colliders and Oreos. This strategy is perfectly adapted to paper-based knowledge. Books are designed to contain all the information required to stop inquiries within the book’s topic. But now that our medium can handle far more ideas and information, and now that it is a connective medium (ideas to ideas, people to ideas, people to people), our strategy is changing. And that is changing the very shape of knowledge.
A History of Facts
In 1954, there were so many polio cases in Boston that Children’s Hospital had to perform sidewalk triage before an audience of distressed parents seated in idling cars. So, when Jonas Salk’s polio vaccine was successfully tested in 1955, he became a hero to an entire generation. But producing the Salk vaccine depended on an earlier breakthrough by John Enders, who with his colleagues discovered in 1948 how to grow the polio virus outside the human body. At the time, viruses were invisible even to the most powerful microscopes. The only way Enders’s team could tell if they’d succeeded in growing the polio virus was to inject it into the brain of a monkey and see if it came down with the disease’s awful symptoms. Enders’s technique made it possible for Salk to develop his famous vaccine; the work earned Enders and his colleagues a Nobel Prize in 1954, even before Salk’s vaccine had been proven effective.[3]
The chain of knowledge that produced Salk’s vaccine followed the most up-to-date medical processes of the day. Yet, in one important way, the path their breakthroughs took was no different from our most ancient way of proceeding. For example, just as surely as we know about viruses and their enemies, the ancients knew that people came in four flavors or “humors”: sanguine, choleric, melancholic, and phlegmatic. Each humor was part of a complex conceptual system of organs, bodily fluids, seasons of the year, astrological signs, and treatments. If your humors got out of whack, you might find yourself sent to the local barber for a helpful bloodletting or a purge. (Don’t ask.) This knowledge of how the body worked, and how it integrated into its environment, was believed by the Egyptians, Greeks, and Romans, by Muslims, Christians, Jews, and “pagans.” For close to 2,000 years, we humans knew the humors were real and extremely important.
Of course, we were wrong. Your bile doesn’t depend on your astrological sign and your personality doesn’t depend on your liver. But even though we now thoroughly reject the idea of the humors, Galen of Pergamum, Enders of Connecticut, and Salk of New York, born across a stretch of 1,800 years, all accepted and assumed that knowledge works essentially the same way: Knowledge is a structure built on a firm foundation that lets us securely add new pieces—poliomyelitis makes sense if you already know about viruses and immune systems. Because we can lay firm foundations, our species gets smarter and better able to survive our virus-laden environments.
Of course, there are vital differences between the houses of knowledge inhabited by Galen and by Enders and Salk. The humor-ists assumed that their foundation was strong and true because it enabled them to draw analogies among all the different realms, from the biological to the social to the psychological to the astronomical. That made sense when we believed that God ordered His universe in the maximally beautiful way, that He gave us minds so we could appreciate His handiwork, and that our minds worked (in His image) by associating ideas. So, to see analogies was to see God’s order. Of course, we modern folks don’t think analogies are a scientific way of proceeding, or else we’d still think that because some tumors cause veins to swell and look like crabs, cancer must have something to do with the Crab constellation. We believe the firm foundation of knowledge consists not of analogies but of facts. The ancients and we moderns disagree about how to lay a firm foundation, but we firmly believe in foundations themselves.
Facts are facts. It’s a fact that polio vaccine is effective and it’s a fact that Cancer the constellation has nothing to do with tumors. But the idea that the house of knowledge is built on a foundation of facts is not itself a fact. It’s an idea with a history that is now taking a sharp turn.
In 2006, former President Bill Clinton wrote an op-ed in the New York Times about the legacy of the welfare reform legislation he had championed a decade earlier.[4] His verdict: “The last 10 years have shown that we did in fact end welfare as we knew it, creating a new beginning for millions of Americans.” He then supported that assertion:
In the past decade, welfare rolls have dropped substantially, from 12.2 million in 1996 to 4.5 million today. At the same time, caseloads declined by 54 percent. Sixty percent of mothers who left welfare found work, far surpassing predictions of experts. Through the Welfare to Work Partnership, which my administration started to speed the transition to employment, more than 20,000 businesses hired 1.1 million former welfare recipients. Welfare reform has proved a great success.
At the heart of Clinton’s argument is a series of facts. If we wanted to argue against him, we might suggest that those facts are cherry-picked, out of context, or even blatant lies. But to do this, we’d have to offer our own set of facts. We might point out that the poverty level was declining even before Clinton’s act went into effect,[5] and that the number of Americans with zero income who receive food stamps has soared to about 6 million because Clinton’s bill cut off other sources of cash support.[6] President Clinton would certainly respond with more facts, for we fight facts with facts.
It wasn’t always so. In 1816, the British House of Commons debated creating a committee to look into requiring children to be at least nine years old before entering the workforce and limiting their workday to 12.5 hours. This was fiercely opposed by factory owners who were working children as young as six for up to sixteen hours a day. The arguments for and against the law were based on principles and generalities, not facts: “Such a proceeding was libel on the humanity of parents,” argued a Mr. Curwen, who believed that parents are the best judges of what’s best for their children.[7] Nevertheless, a committee was formed to investigate the situation. Even the experts called by the committee—quite a new practice in itself—were fact-free. Ashley Cooper, Esq., a surgeon, testified that he didn’t think children aged seven to ten could work more than ten hours a day without harming their health. “Upon a subject of this kind, one must answer upon general principles,” he said, pointing to the need for “air, exercise, and nourishment.” Sir Gilbert Blane, M.D., backed up that opinion, although he noted: “I have no experience as to manufactories, and, therefore, my answer must depend on general analogy.”[8]

General principles may be true, and general analogies may work, but we moderns know that they need to be supported by reams of facts before they can be trusted. Bill Clinton would not have given us just some untested adages.
There certainly were facts before the start of the nineteenth century; it was a fact that the ocean was salty even before humans first tasted it, and it was a fact that polio is caused by a virus even before we had discovered viruses. But only relatively recently have facts emerged as the general foundation of knowledge and the final resort of disagreements.
Indeed, we didn’t have a word for facts until a few hundred years ago. When in 400 A.D. Jerome translated John 1:14 (“and the Word was made flesh”) into the Latin “et Verbum caro factum est,” “factum” meant “that which was done,” from “facere” (“to do”).[9] The word “fact” entered English in the early 1500s with that meaning, but by the 1600s facts were a narrower class of deeds, as in “He is . . . hanged . . . neere the place where the fact was committed” (1577).[10] Facts were evil deeds, so a murder would have been a fact, but not that the Pyramids are in Egypt. How did we manage so long without a word for what we currently mean by “fact”?
For us moderns, the hardest, most solid of facts are about particulars (there is a rock by the side of the road, there are six chairs around the table), but our ancestors tended to disdain the particular because it comes to us through bodily perception, a capability we share with all animals. For them, knowledge had to be something more than what we learn through our mere senses, because it is such a distinctly human capability of our God-given and God-like soul. Whereas perception sees individual items (this berry, that cat), knowledge discerns what this cat has in common with all other cats that makes it into a cat; knowledge sees its essence as a cat. For our ancestors, knowledge was of universals—not of facts about this or that cat. The idea that knowledge was a slew of facts about particulars would have struck them as a misuse of our God-given instrument.
So what happened in the nineteenth century to make facts the bedrock of knowledge? The path is twisty. Mary Poovey cites the invention in Italy of double-entry bookkeeping, which in the sixteenth century provided a process by which ledger entries could be proved accurate to anyone who, regardless of status, followed the proper procedure.[11] But most historians look to the seventeenth century, when the philosopher and statesman Francis Bacon, seeking to put knowledge on a more certain basis, invented the scientific method. Like Aristotle, he sought knowledge of universals.[12] But he proposed getting to them through careful experiments on particulars. For example, when Bacon wanted to find out how much a liquid expands when it becomes a gas, he filled a one-ounce vial with alcohol, capped it with a bladder, heated the alcohol until the bladder filled, and then measured how much liquid was left.[13] From this experiment on particulars, he was able to propose a theory that applied universally to heated liquids.

Having the particular ground the universal was a remarkable inversion of the traditional approach to knowledge: No longer derived by logical deduction from grand principles, theories would hereafter be constructed out of facts the way houses are built out of bricks.[14]
One more turn and we get to modern facts. When trying to understand a word as basic as “fact,” the twentieth-century British philosopher John Austin recommended considering what it is being used in contrast to. Bacon contrasted facts with theories. But the modern sense of facts emerged only when we started contrasting them with people’s self-interest: the facts of what cleaning chimneys did to little boys versus the upper class’s interest in getting their chimneys clean, which led the rich to tell themselves that hard work builds little boys’ character.[15] This change in the meaning of facts vaulted facts to the center of the social stage, for that’s where we often jostle about coordinating our interests. Facts went from what grounds scientific theories to what also grounds social policy.
As if in a stop-motion video that shows a flower blooming, we can see this rise in the social role of facts occurring within the life span of one great thinker. Thomas Robert Malthus, born in 1766, is best remembered for warning in 1798 that while populations grow geometrically (2, 4, 8, 16 . . . ), the food supply required to support them grows only arithmetically (1, 2, 3, 4, 5 . . . ). Here’s his famous “proof”:[16]

First, That food is necessary to the existence of man.
Secondly, That the passion between the sexes is necessary and will remain nearly in its present state....
Assuming then, my postulata as granted, I say, that the power of population is indefinitely greater than the power in the earth to produce subsistence for man.
Malthus doesn’t painstakingly assemble facts about population growth and crop yields. Instead, he gives us a logical deduction from premises that he presents as self-evident. From this he goes further: Starvation is inevitable, and therefore those born poor should not be sustained by the government since there isn’t enough food to go around anyway. We must raise our moral standards and procreate less.
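In modern notation (my restatement, not Malthus’s own), the two growth patterns he contrasts can be written as:

```latex
% Malthus's premise, restated: population doubles each generation,
% while the food supply gains only a fixed increment per generation.
% P_0, F_0, and c are illustrative constants, not Malthus's figures.
\[
  P_n = P_0 \cdot 2^n \qquad \text{(geometric: } 2, 4, 8, 16, \ldots)
\]
\[
  F_n = F_0 + c\,n \qquad \text{(arithmetic: } 1, 2, 3, 4, 5, \ldots)
\]
\[
  \frac{P_n}{F_n} \to \infty \quad \text{as } n \to \infty
\]
```

On those premises the ratio of mouths to food grows without bound, whatever the starting values, which is why the whole argument stands or falls with the premises rather than with any collected facts.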
Malthus spends the rest of the book unfurling a series of bold, unsupported generalizations explaining why the various populations of the earth haven’t already grown themselves to the point of starvation. Even when dealing with his own countrymen, he proceeds in a remarkably fact-free fashion. The “higher classes” don’t feel the need to marry because of the “facility with which they can indulge themselves in an illicit intercourse.”[17] Tradesmen and farmers cannot afford to marry until they’re older. Laborers can’t afford to divide their “pittance among four or five” and thus do not overpopulate. Servants would lose their “comfortable” situations if they married. The modern researcher would be appalled at these generalizations. What are the average family sizes for each of these classes? What is the average salary of laborers? What are the costs of raising a family of four or five? How common is “illicit intercourse” among the rich, and how does it compare to the rate among the other classes? If Malthus had submitted his book as a final paper in a sophomore college class, he would have been sent back to a course on Remedial Methodology to rewrite it.
Throughout his life, Malthus did indeed rewrite his masterwork. By the time he published the sixth and final edition in 1826, it was dense with facts, statistics, and discussions of the validity of contrasting studies. He compared mortality rates in different regions, explained anomalous statistical results, and in general behaved like a fact-based modern researcher. The change clearly was due in part to the availability of more facts. But facts became more available because their status had grown. Fact-based knowledge was arriving, spurred on in part by Malthus’s own work, and driven by reformers who wanted society to face the miserable reality of England’s working class and poor. In that political struggle, facts got contrasted with interests, and not just with theory.
And interests could not have been much further apart than they were at the beginning of the nineteenth century, when the upper class, secure that its position was part of the divine plan, felt no compunction about setting children to work in factories or sending boys as young as five up chimneys as narrow as seven inches square. In 1819, when the British House of Commons considered a new bill that would have kept children younger than fourteen out of chimneys, a member of the House named Mr. Denman argued that it was better that the boys be gainfully employed than that they engage in “the fraud and pilfering which was now so common among boys of tender age.”[18] A Mr. Ommaney agreed, for the chimney sweeps he had seen were “gay, cheerful, and contented.” And, added Mr. Denman, there was no other reliable way to clean the really small flues for which little boys are perfectly shaped.
Yet, the tide was turning. The proponents of the reform bill countered Mr. Ommaney’s vision of “gay, cheerful, and contented” chimney sweeps with factual evidence from physicians that these boys “exhibited every symptom of premature old age.” The proponents won. The fact of the misery of these children overcame the old assumption that the poor are poor because they deserve to be—a belief that conveniently supported the self-interest of Mr. Denman, Mr. Ommaney, and the rest of the old guard.
The chimney sweep bill was just one sign, not a turning point. The triumph of facts was gradual. But it was greatly enabled by a single thinker who provided the intellectual framework for basing policies on facts rather than on moral assumptions or the interests of those in power. Jeremy Bentham was “one of the few great reformers to be appreciated in his own lifetime.”[19] By the time he was four, he was learning Latin and Greek. By the time he was five, he was known as “the philosopher.” At age thirteen he was admitted to Oxford and trained to become a lawyer like his father. But he was too curious to stay within any one domain and became, among other things, a philosopher best remembered for the principle of utility: Since pleasure and pain are equal motivators for all people, Bentham declared, the ultimate criterion for evaluating an action was whether it would result in “the greatest happiness of the greatest number.” Bentham thus gave us a new way of evaluating social policies, a type of bookkeeping committed to improving the ledger overall.
Applied to government, Bentham’s ideas were radical. Suppose the happiness of Mr. Denman and Mr. Ommaney did not count for more than the happiness of the boys sweeping their chimneys. Suppose government policies should be guided by a pragmatic sense of what works to increase the overall happiness. If so, the government would first need to survey what life was actually like for all of its citizens. It would need to base policy on facts.
But that in turn required the use of a tool only recently gaining credence: statistics. The word itself entered English only around 1770, coming from a German word for information about the state (which explains the “stat” part of the word).[20] Statistics were meant to be independent of opinions and conclusions. They became the way the social reform movement argued against the conclusions that interested parties might have preferred.[21]
In the 1830s, statistics gave Bentham’s ideas a method, and Parliament, partially due to Bentham’s influence, soon commissioned reports on poverty, crime, education, and other social concerns. Distributed in “blue books” rich with anecdotes, interviews, and statistical tables, these reports put parliamentary debate of social issues on a factual basis, even if their statistical methods were not up to modern standards. Blue books also provided material for popular novels that further advanced the social reform movement.[22]
One of the most famous of these novelists of social reform was, of course, Charles Dickens. But in London, blue books were flying into Parliament and off the shelves at such a pace that by 1854 Dickens was part of a backlash against the whole fact-based approach. In Hard Times, Dickens has the schoolmaster Thomas Gradgrind exhort his students: “‘Fact, fact, fact!’”[23] The unsympathetic Mr. Gradgrind tells his students that they must not decorate their future homes with carpets with floral designs. “‘You don’t walk upon flowers in fact; you cannot be allowed to walk upon flowers in carpets. You don’t find that foreign birds and butterflies come and perch upon your crockery; you cannot be permitted to paint foreign birds and butterflies upon your crockery.’”[24] Facts, for Dickens, stood in contrast to imagination and art, and were too dry a way to understand human life.
Dickens made it clear to his readers that the rise of facts to which he objected was coming out of the political sphere. Gradgrind tells his students: “‘We hope to have, before long, a board of fact, composed of commissioners of facts, who will force the people to be a people of fact, and of nothing but fact.’”[25] In case the political reference wasn’t clear enough, Dickens tells us that Mr. Gradgrind’s room had an “abundance of blue books.” Says Dickens: “Whatever they could prove (which is usually anything you like), they proved there.... In that charmed apartment, the most complicated social questions were cast up, got into exact totals, and finally settled.... As if an astronomical observatory should be made without any windows, and the astronomer within should arrange the starry universe solely by pen, ink, and paper.”[26] Dickens had enormous sympathy for the poor, having worked in a shoe polish factory at the age of twelve and having watched his father taken away to debtors’ prison.[27] But facts in blue books didn’t reveal the truth. For that you needed to understand in depth and compassionately the lived plights of social unfortunates, the way we do when reading a novel. What a coincidence!
Even with the incredibly popular Mr. Dickens railing against over-reliance on facts, they continued their rise to prominence. With the invention of “fact-finding missions,” facts became the basis for resolving international disputes. Indeed, it’s hard to believe that they’re a modern invention, but the first mention of a fact-finding mission in the New York Times occurred in 1893, when President Grover Cleveland sent someone to investigate the dethroning of Hawaii’s last queen, Lili’uokalani.[28] These commissions became a normal procedure only after the Hague conference created its first fact-finding mission in 1904, when five countries jointly investigated the mistaken sinking of an English trawler by Russia’s Baltic Fleet.[29] Russia paid England reparations for the so-called Dogger Bank Incident, and for the first time, an international dispute was settled by uninvolved—disinterested—countries working to establish the facts and nothing but the facts. By the 1920s, fact-finding missions had become a normal and accepted part of how countries tried to settle their problems[30]—possibly because the 16.5 million deaths in World War I showed that the other popular method of settling disputes didn’t work so well. These days, if something large enough goes wrong, we create a fact-finding mission as if this were a natural and age-old way of proceeding.
Over the course of two hundred years, facts came a long way. From the opposite of theories, to the opposite of self-interest, to the way unfriendly countries avoid war, facts became the elemental truths about the world—truths that are true regardless of what we may think or want to believe. Journalists gathered facts, almanacs aggregated facts, board games quizzed us on them, experts predicted entire baseball seasons based on previous seasons’ facts, governments prepared to deploy Armageddon’s weapons based on cold-hearted assessments of facts. Facts had hit rock bottom, which is exactly where we wanted them.
Darwin’s Facts
One day in the 1850s Henry David Thoreau observed a bird he hadn’t seen before “flapping low with heavy wing.” As the bird flew overhead, Thoreau caught sight of two spots on the bottom of its wings and realized it was a kind of gull. “How sweet is the perception of a new natural fact!”[31] Thoreau chirruped. A new fact had been uncovered: This particular bird was a gull. Thoreau’s fact is in the fact’s most basic form: Some this is a that.
Yet Thoreau’s identification of that bird wasn’t the sort of fact that does the heavy lifting of knowledge. It did not advance our knowledge of gulls, of wings, or even of spots in any appreciable way. Thoreau was not that ambitious. As Ralph Waldo Emerson lamented in his eulogy of his friend, “instead of engineering for all America, he was the captain of a huckleberry party.”[32]
While Thoreau was picking huckleberries, Charles Darwin was spending seven years intently exploring the small world of Cirripedia—barnacles. The two resulting dry and difficult volumes—so little like his masterful On the Origin of Species published just a few years later in 1859—are careful recitations of facts that together describe the little creatures in unrelenting detail. But they lead up to a this is a that far more consequential than Thoreau’s. It is a classic example of how fact-based knowledge has worked, and has worked so well that it can be worth spending seven years going to the family dinner table smelling of dead crustaceans.
Darwin’s work on barnacles began with the accidental discovery of a small, persistent fact. In 1835, before he had formulated his great theory, Darwin was a young man sailing on the Beagle, exploring the small variations in the plants and animals of the Galapagos Islands. There he discovered tiny barnacle parasites inside the shell of a mollusk—highly unusual for creatures that usually attach to rocks. Examining the parasites more closely, he found tiny larvae that looked surprisingly like crustacean larvae. Mollusks and crustaceans were classified as entirely separate classes of animals, so why would a member of one produce larvae of the other? Darwin filed the question away until 1846, when for the next seven years it fully absorbed him.
To call this work painstaking would be to underestimate it vastly. But the minutiae of his work with needle-like dissection tools and magnifying equipment were guided by a grand theory. The idea that organisms evolve by small steps led Darwin to look for continuities among them. Accordingly, he probed hermaphroditic barnacles and found male organs that were so “unusually small” that he would not “have made this out, had not my species theory convinced me, that an hermaphrodite species must pass into a bisexual species by insensibly small stages.”[33]
Darwin’s first volume on barnacles is 370 pages long but is really about a single fact: Barnacles are crustaceans. This fact could not have been uncovered by thinking about the world while sitting on the banks of Walden Pond, and bringing your laundry home for your mother and sister to wash (as Thoreau did).[34] Darwin’s fact required a trip from England to the Galapagos, the close inspection of a mollusk with parasites, the acquisition of multiple collections of specimens, seven years of exacting dissections, and a world-changing theory of animal origins.
How sweet indeed is the perception of a new natural fact.
Now flash forward to the present and ask Hunch.com for help in any matter of taste. What city should I visit? What character should I dress up as for Halloween? What Chinese vegetable should I cook tonight? Hunch.com will supply statistically significant answers based upon the rippling, overlapping similarities among all its users. For this to work, Hunch has to know lots about its users—so much that asking them to fill in a typical profile (“Favorite type of music,” “Politics: left, middle, or right”) would not even begin to suffice. Hunch is looking for a sort of fact that would have confounded Darwin, Thoreau, and most of us just a few years ago.
When I first visited the site, wondering what movie I should see that evening, Hunch asked me a series of questions that had nothing to do with movies. Do I store my drinking glasses right side up? Would I prefer to wear running shoes, boots, or sandals? When I throw out a sheet of paper, do I crumple it? Have I ever touched a dolphin? I have answered a total of 334 such questions since I began using the site, primarily because answering them is surprisingly fun. On the basis of my answers, Hunch recommended the movies 28 Days, Casablanca, The Fugitive, and The Big Lebowski. Hunch.com has got my number.
It got that number by analyzing my answers in the context of millions of answers given by other users. The analysis is purely statistical, in a way that nineteenth-century scientists and statisticians would not have foreseen. The analysis is not in support of a theory and it produces no theory. Hunch has no idea why people who, say, prefer to wear sandals on a beach and who have not blown a dandelion in the past year might like those four movies. It doesn’t have a hypothesis and it doesn’t have a guess. It just has statistical correlations.
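To make that concrete, here is a minimal sketch of theory-free, correlation-only recommendation of the kind just described. It is not Hunch’s actual algorithm; the users, answers, and ratings are invented for illustration:

```python
# A toy, theory-free recommender: find users whose quirky answers
# correlate with mine and average their movie ratings. All data invented.
import numpy as np

# Rows = users, columns = yes/no answers to questions like
# "Do you crumple paper?" or "Have you touched a dolphin?"
answers = np.array([
    [1, 0, 1, 1, 0],   # me
    [1, 0, 1, 0, 0],   # user 1
    [0, 1, 0, 1, 1],   # user 2
    [1, 0, 1, 1, 1],   # user 3
])

# Ratings (0-5) that the other users gave four movies.
ratings = np.array([
    [5, 3, 4, 5],      # user 1
    [1, 4, 2, 1],      # user 2
    [4, 2, 5, 4],      # user 3
])

me, others = answers[0], answers[1:]

# Similarity = Pearson correlation between my answers and each user's.
sims = np.array([np.corrcoef(me, u)[0, 1] for u in others])

# Predicted appeal of each movie: similarity-weighted average rating.
# No hypothesis about WHY the correlations hold -- just arithmetic.
weights = np.clip(sims, 0, None)          # ignore dissimilar users
predicted = weights @ ratings / weights.sum()
print(predicted.round(2))                 # highest score = recommendation
```

The point of the sketch is the absence of any model: nothing in it knows what a dandelion or a movie is, yet the correlations alone yield a ranking.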
Hunch’s facts—how I store my drinking glasses and whether I’ve recently blown on a dandelion—are the opposite of Darwin’s facts:
Darwin’s facts were hard-won. He spent seven years establishing that barnacles are crustaceans. At Hunch, you can answer 12 questions a minute. The average user has answered about 150 of them. Facts are fast and fun at Hunch.
Darwin’s facts were focused on a particular problem: understanding what sort of critter the barnacle is. Hunch’s facts are purposefully unconstrained. One moment you’re answering a question about your favorite ABBA song and the next you’re declaring whether you consider Russia to be part of Europe. Hunch needs answers to be spread wide and thin in order to generate useful results.
Darwin’s facts together cover some finite topic. In 370 pages, Darwin goes through all of the relevant facts about the three types of barnacles, and nails his argument. Granted, that’s a lot of pages and even more facts, but it has a beginning and an end. It fits between covers. Hunch’s facts don’t “cover” anything. In its first seven months, the site gathered over 7,000 different questions, almost all of them from its users. The only stopping point is when you’re tired of answering silly questions. And even then you can always go back for more.
Darwin’s facts existed before he discovered them. Hermaphroditic barnacles had tiny male organs before Darwin peeked at them. It’s not nearly as clear whether Hunch is uncovering or generating facts. It’s a fact about me that I would prefer cotton candy to a shoe shine, but since I had never before considered the comparison, it feels like a fact that didn’t quite exist before someone asked. If my never having touched a dolphin counts as a fact, then so must the fact that I have never touched a Klingon, or a purple lime, or a blue lime, or a plaid lime—an infinite series of facts that didn’t exist until someone asked.
Darwin’s facts emerged because he had a theory that guided him. Otherwise, why care about hermaphroditism in barnacles? Hunch doesn’t know why your preference in salty snack foods might help predict your favorite type of poker, and it doesn’t care.
Finally, when Darwin noticed a parasitic barnacle in the Galapagos, it was a fact worth remembering only because he assumed—correctly—that this individual barnacle was representative of a species. When he writes in Volume 1 that “in L. anatifera alone, the uppermost part of the peduncle is dark,”[35] he’s referring not to an individual barnacle’s peduncle but to the species’ pedunculosity. For Darwin, the facts worth noting are the ones that apply to more than one individual. Exactly the opposite is the case at Hunch. “Are you an air-breather?” is not a helpful question at Hunch because all of its mammalian users will reply the same way.
Now, Hunch is not producing results on the order of Darwin’s barnacle studies or his Origin of Species. Nor do Hunch’s facts replace the need for Darwinian-style facts (although in Chapter 7 we will see how science is using some of Hunch’s basic techniques). Hunch is doing something useful—helping you find the next movie to see or the right wedding gift to buy—but it’s not making any serious claim to producing eternal knowledge. It’s just about, well, hunches.
Nevertheless, Hunch is a trivial example of a serious shift in our image of what knowledge looks like. Darwin’s facts were relatively scarce both because they were hard to obtain—seven years dissecting barnacles—and because they were hard to get published. Some facts are still so hard to obtain that multi-country consortia have to spend billions of dollars building high-energy particle colliders to get them to show their quantum-scale faces. But our information technologies are precisely the same as our communication technologies, so learning a fact can be precisely the same as publishing a fact to the world. The Internet’s abundant capacity has removed the old artificial constraints on publishing—including getting our content checked and verified. The new strategy of publishing everything we find out thus results in an immense cloud of data, free of theory, published before verified, and available to anyone with an Internet connection.
And this is changing the role that facts have played as the foundation of knowledge.
The Great Unnailing
The late, revered senator from New York, Daniel Patrick Moynihan, famously said, “Everyone is entitled to his own opinions, but not to his own facts.”
Perhaps this is what President Barack Obama had in mind when he took as his first executive action the signing of the “Transparency and Open Government” memorandum, requiring executive-branch agencies to “disclose information rapidly in forms that the public can readily find and use.”[36] Two months later, Vivek Kundra, Obama’s pick for the new post of federal Chief Information Officer, announced plans to create a site—Data.gov—where executive-branch agencies were required to post all their nonsecret data so the public could access it—everything from requests received by the Department of Agriculture for permits for genetically engineered plants to the National Cemetery Administration’s customer satisfaction surveys. When Data.gov launched, it had only 47 datasets. Nine months later, there were 168,000,[37] and there had been 64 million hits on the site.[38]
Obama’s executive order intended to establish—to use a software industry term—a new default. A software default is the configuration of options with which software ships; the user has to take special steps to change them, even if those steps are as easy as clicking on a check box. Defaults are crucial because they determine the user’s first experience of the software: Get the defaults wrong, and you’ll lose a lot of customers who can’t be bothered to change their preferences, or who don’t know that a particular option is open to them. But defaults are even more important as symbols indicating what the software really is and how it is supposed to work. In the case of Microsoft Word, writing multi-page, text-based documents, and not posters or brochures, is the default. The default for Ritz crackers, as depicted on the front of the box, is that they’re meant to be eaten by themselves or with cheese.[39]
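In code, the idea looks like this minimal sketch (the preference names are invented, not any real product’s):

```python
# Toy illustration of a software default: the configuration a program
# ships with, used unless someone takes a deliberate step to change it.
from dataclasses import dataclass

@dataclass
class DocumentPrefs:
    page_size: str = "Letter"      # the shipped defaults
    orientation: str = "portrait"
    autosave_minutes: int = 10

typical = DocumentPrefs()                  # most users stop here
poster = DocumentPrefs(page_size="A1",     # overriding a default takes
                       orientation="landscape")  # an explicit step
print(typical)
print(poster)
```

The asymmetry is the point: the zero-argument case is what almost everyone gets, which is why changing a default changes what the thing effectively is.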
Before Obama’s order, most government data was by default unavailable to the public. The Environmental Protection Agency used to release the results of its highway mileage tests, but not the data such tests were based on. After the new default went into effect, at the EPA’s data site—FuelEconomy.gov—you can download a spreadsheet of mileage testing information that will tell you not just that a 2010 Prius gets 51 MPG in the city but also that the average annual fuel cost should be $780 and it’s got a “Multipoint/sequential fuel injection” system; there’s even information about hydrogen fuel cell cars that don’t yet exist.[40] Advocates of open government hope that changing the default will make it easier to hold the government accountable, and will spur the development of new software applications that make use of those data, the way US Geological Survey and Census Bureau data have been put to unexpected uses in the past.
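As an illustration of the kind of reuse those advocates have in mind, here is how one might pull a single model’s numbers out of such a download. The file name and column labels are assumptions made for illustration, not the site’s actual schema:

```python
# Sketch: filter a FuelEconomy.gov-style spreadsheet for one model.
# "vehicles.csv" and the column names below are hypothetical.
import pandas as pd

df = pd.read_csv("vehicles.csv")   # a local export of the mileage data
prius = df[(df["year"] == 2010) & (df["model"].str.contains("Prius"))]
print(prius[["make", "model", "city_mpg", "annual_fuel_cost"]])
```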
The agency-wide change in default effected by Kundra on behalf of the Obama administration was intended to signal something important about the role and nature of government, but it also tells us something about the changing role and nature of facts.
FuelEconomy.gov may give us one hundred categories of data, but there are no columns for the ambient temperature, the pounds per square inch of the air in the tires, or the pull of the moon on the day the road tests were done, all of which might have some small effect on the data. We know that there could be another hundred, thousand, or ten thousand columns of data, and reality would still outrun our spreadsheet. The unimaginably large fields of data at Data.gov—we are back to measuring stacked War and Peaces—do not feel like they’re getting us appreciably closer to having a complete picture of the world. Their magnitude is itself an argument against any such possibility. Data.gov and FuelEconomy.gov are not parliamentary blue books. They are not trying to nail down a conclusion.
Data.gov and the equivalents it has spurred in governments around the world, the massive databases of economic information released by the World Bank, the entire human genome, the maps of billions of stars, the full text of over 10 million books made accessible by Google Books, the attempts to catalog all Earth species, all of these are part of the great unnailing: the making accessible of vast quantities of facts as a research resource for anyone, without regard to point of view or purpose. These open aggregations are often now referred to as “data commons,” and they are becoming the default for data that has no particular reason to be kept secret.
This unnailing is perceived by some people as potentially dangerous. For example, not long after Data.gov was begun, open government proponents were surprised to read an article in a liberal journal by one of their great advocates, Lawrence Lessig, about open data’s downside. In the article, titled “Against Transparency,” Lessig warned that making available oceans of uninterpreted data may lead politically motivated operatives to draw specious connections: Every time a candidate accepts funds from a lobbying group and votes for a bill that the lobbyists favor, such operatives could claim that this is proof that the candidate is corrupt. Thus, this unnailed data could further enable an accusatory culture. (Lessig’s article proposes reducing citizens’ cynicism by reforming the United States’ campaign finance process.)[41]
This is a second irony of the great unnailing: The massive increase in the amount of information available makes it easier than ever for us to go wrong. We have so many facts at such ready disposal that they lose their ability to nail conclusions down, because there are always other facts supporting other interpretations. Let’s say I gather data on global climate change from the World Resources Institute’s collection of information from two hundred countries, and you grab some different data from Fauna Europaea’s database of the distribution of species. You don’t like my conclusion? Within a couple of seconds, you can fill your own bag with facts.
Our foundations are harder to nail down than they used to be.
Facts have changed not only their role in arguments but their own basic shape. We can distinguish three phases in the recent history of facts (although the division is much messier than that).
First, there was the Age of Classic Facts, represented by Darwin with a dissecting kit and by parliamentary blue books. These facts were relatively sparse, painstakingly discovered, and used to prove theories.
Then, in the 1950s we entered the Age of Databased Facts, represented by punchcards stacked next to a mainframe computer. We thought we had a lot of information then, but it would have taken just under 2 billion cards to store what’s on a rather wimpy 200-gigabyte hard drive on a laptop—a stack about 300 miles high.[42] So, of course the databases of the time had to strictly limit the amount of information they recorded: the employee’s name, date of birth, starting date, and Social Security number, but not hobbyist skills or countries lived in. The Age of Data still conformed to our ancient strategy for knowing the world by limiting what we know—a handful of fields, chosen and organized by a handful of people.
Now, in the Age of the Net it makes sense to talk about networked facts. If classic facts and databased facts are both taken as fundamentally isolated units of knowledge, networked facts are assumed to be part of a network. Networked facts exist within a web of links that make them useful and understandable. For example, in the days of print, the tables of data in a scientific article were tiny extracts from masses of facts and data that themselves were not published. Now, on the Internet, scientific journals are increasingly hyperlinking from the data in their articles to the databases from which they are drawn. For example, when an article in the journal Public Library of Science Medicine[43] examines “the predictors of live birth” in in vitro fertilization by analyzing 144,018 attempts, it links to the UK open government site where the source data—“the world’s oldest and most comprehensive database of fertility treatment in the UK”—is available.[44] The new default is: If you’re going to cite the data, you might as well link to it. Networked facts point to where they came from and, sometimes, where they lead to. Indeed, a new standard called Linked Data is making it easier to make the facts presented on one site useful to other sites in unanticipated ways—enabling an ad hoc worldwide data commons. Key to Linked Data is the ability of a computer program not only to get the fact but also to ask the resource for a link to more information about the context of the fact.[45]
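To make that “ask for more context” idea concrete, here is a small sketch using Python’s rdflib library to fetch the machine-readable description of one resource from DBpedia, a public Linked Data source. The pattern (follow the links a resource hands back) is standard Linked Data practice; treat the specific triples returned as illustrative:

```python
# Sketch: "follow your nose" through Linked Data. We fetch the
# machine-readable description of one resource; every property and
# object in the result is itself a link we could fetch next.
from rdflib import Graph, URIRef

g = Graph()
g.parse("http://dbpedia.org/resource/Pittsburgh")  # negotiates RDF over HTTP

subject = URIRef("http://dbpedia.org/resource/Pittsburgh")
for predicate, obj in g.predicate_objects(subject):
    print(predicate, "->", obj)
    # Any URI printed here can be parsed in turn -- that chain of
    # links is what makes the fact networked rather than isolated.
```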
Facts have become networked because our new information infrastructure happens also to be a hyperlinked publishing system. If you’re going to make a fact visible, it’s so easy to link it to its source that you’ll need some special justification not to do so. But our new network doesn’t just unify our information and publishing systems. It also integrates us with other people. In the EPA’s database of car mileage, the Prius’s 51 miles per gallon is just a number. Once that fact is embedded in a labeled table, it becomes meaningful. Then, when someone posts it on a page, it picks up more meaning. Whatever point that page is making—Prius’s mileage is great, isn’t great enough, is a sham—it’s quite likely that somewhere another page links to that one to argue the other way. Thus, the networked datum “51” points back to a traditional database, but also points ahead into the unruly context of networked discussion. This makes our ordinary encounter with facts very different from what it used to be. We don’t see them marching single-file within the confines of an argument contained within a blue book, a scientific article, or a printed tome. We see them picked up, splatted against a wall, contradicted, torn apart, amplified, and mocked. We are witnessing a version of Newton’s Third Law: On the Net, every fact has an equal and opposite reaction. Those reactive facts may be dead wrong. Indeed, when facts truly contradict, at least one of them has to be wrong. But this continuous, multi-sided, linked contradiction of every fact changes the nature and role of facts for our culture.
When Daniel Patrick Moynihan said, “Everyone is entitled to his own opinions, but not to his own facts,” what we heard was: Facts give us a way of settling our disagreements. But networked facts open out into a network of disagreement. We may miss the old Age of Classic Facts, but we should recognize that its view of facts was based not in fact but in the paper medium that published facts. Because of the economics of paper, facts were relatively rare and gem-like because there wasn’t room for a whole lot of them. Because of the physics of paper, once a fact was printed, it stayed there on the page, uncontradicted, at least on that page. The limitations of paper made facts look far more manageable than they seem now that we see them linked into our unlimited network.
Of course, there are important domains where facts play their old role, and we would not want it otherwise—lots of lives were saved because Jonas Salk had a methodology for proving that his vaccine worked. The Net lets us find that fact, and explore its roots. And yet, the longer you are on the Net, the more fully you realize that in so many areas, facts fail at their old job. The people who think vaccines cause autism, the ones who still think Barack Obama was born in Kenya, the ones who think the government is hiding proof that aliens walk among us, they all have more facts than ever to prove their case. And so do those who think (as I do) that those beliefs are crazy. We see all too clearly how impotent facts are in the face of firmly held beliefs. We have access to more facts than ever before, so we can see more convincingly than ever before that facts are not doing the job we hired them for.
Let me stress that the old role of facts does not vanish from the Net. Scientists still establish facts as in the old days, thankfully. Policy debates continue to try to ground their conclusions in facts, although as always there are fierce arguments over which facts are relevant and what to make of them. And, importantly, the realm of commoditized facts—facts that a large community of belief accepts as not worth arguing about—is growing, as is access to those facts: Anyone with a Web browser can get a figure for the population of Pittsburgh that for almost all conceivable purposes will count as reliable enough. But push on a fact hard enough, and you’ll find someone contradicting it. Try to use facts to ground an argument, and you’ll find links to those who disagree with you all the way down to the ground. Our new medium of knowledge is shredding our old optimism that we could all agree on facts and, having done so, could all agree on conclusions. Indeed, we have to wonder whether that old optimism was based on the limitations inherent in paper publishing: We thought we were building an unshaken house based on the foundation of facts simply because the clamorous disagreement had no public voice.
In short, while facts are still facts, they no longer provide the social bedrock that Senator Moynihan insisted on.
And, by the way, there is no solid certainty that Senator Moynihan ever actually said “Everyone is entitled to his own opinions, but not to his own facts.” It might have been a variant, such as “You are entitled to your own opinion, but you are not entitled to your own facts.” It might actually have been James Schlesinger who said it. That Senator Moynihan ever uttered that phrase simply is not a known fact.
I learned that on the Internet.[46]