Chapter Eight

A BROKEN CULTURE


BIOMEDICAL SCIENCE was not always the hypercompetitive rat race that it has become in recent years. Consider the story of Charles Darwin, as he developed his theory of evolution through natural selection. That discovery became the organizing principle of biology. And the story of how it arose bears almost no resemblance to the way biology and medicine advance today. Darwin spent decades gathering observations and gathering his thoughts. He studied odd little finches in the Galapagos Islands. He pored over collections of insects. Barnacles held his interest for nine years. He spent decades breeding pigeons and soaking seeds in saltwater to see if they could survive long ocean voyages and take root across the sea. He didn’t start out with a coherent hypothesis; he was simply driven by curiosity. In fact, today’s science institutions would reject his approach, Arturo Casadevall told me. “He didn’t stick to one thing. He had no mechanism. And yet he was able to synthesize something that is really the only coherent thing that holds biology together.”

Darwin’s nineteenth-century career is also different in another important way. As a gentleman-scientist, he had no need to hustle for money. And he was in no hurry to publish his discoveries. He did so reluctantly only after becoming aware that a young rival named Alfred Russel Wallace was developing a similar theory. For years Darwin had resisted his friends’ entreaties to put his ideas on paper and claim them for himself. “I rather hate the idea of writing for priority,” Darwin said in a letter to his colleague Charles Lyell, “yet I certainly should be vexed if any one were to publish my doctrines before me.” But then, in those days of the gentleman-scientist, sailing ships, and handwritten correspondence, the stakes were mostly personal pride.

How science has changed. In contrast to the languid years of research during Darwin’s day, the high pressure of competition can tempt even the best scientists into dangerous territory. Carol Greider, who shared the Nobel Prize for her discovery of telomerase, tells a cautionary tale about her early career. Her discovery triggered a race to find out more about this vital enzyme. Telomerase turns out to be composed of genetic material (RNA) and a protein component, coupled together. Greider was working feverishly with a postdoctoral researcher at the Cold Spring Harbor Laboratory on Long Island to identify the protein, while another team was in hot pursuit at the University of Colorado.

Greider and her postdoc isolated two molecules that appeared to fit the bill. Hearing the hoofbeats of competition, she rushed that finding into print. At a meeting a while later, she ran into her chief competitor, Joachim Lingner, who congratulated her but added that he was not giving up on his independent search for the telomerase protein. Greider told me she welcomed that. After all, science is based on the idea that other investigators should verify discoveries. Or not. Soon thereafter, Lingner and his mentor, Tom Cech, published a paper showing convincingly that they had isolated a completely different protein, which was in fact the actual component of the telomerase enzyme. They dubbed it TERT. “It was very clear that he was right,” Greider said. She wrote another paper declaring that her proteins were not in fact part of telomerase.

“It was a pressure-to-publish situation,” she said, “and some of the experiments weren’t as good as they could be, but I let myself be pushed around.” Science doesn’t happen in a social vacuum, and in this case the postdoc in Greider’s lab needed a publication in her name to help land a job. Greider felt the squeeze. On the one hand, their findings were provocative and no doubt publishable. On the other hand, the paper itself pointed out some potentially serious shortcomings in the data. If she’d had all the time in the world, Greider would have worked to resolve those lingering questions, but she says her higher-ups were complaining that her reluctance to publish was hampering the career of her postdoc. Today she chalks the episode up to her inexperience as a young investigator. Unfortunately those career pressures persist; indeed, they are even worse today.

“If you think about the system for incentives now, it pays to be first,” Veronique Kiermer, executive editor of the Public Library of Science (PLOS) journals, told me. “It doesn’t necessarily pay to be right. It actually pays to be sloppy and just cut corners and get there first. That’s wrong. That’s really wrong.” This perverse incentive is warping biomedical science. To keep funding flowing, researchers often choose projects that are likely to succeed quickly over those that will provide bold and deep insights. To make matters worse, there is an enormous mismatch between the number of scientists pursuing research and the funding that’s available to them. There’s no objective way to know what the right number of scientists would be, but absent more funding, there are too many in the system right now. As a result, scientists are actually rewarded for conducting their research with less rigor and publishing dubious results. These pressures start to mount in the very earliest days of a scientist’s career. Eager students with bright ideas and high ideals find themselves swimming against a strong tide.


Kristina Martinez didn’t know she wanted to be a scientist as she was growing up in a small town in rural Virginia, where her extended family raised sheep and cattle and tapped maple trees to make syrup. But, having been one of thirty-five kids in her high school graduating class, she decided to venture out into the world and give the University of North Carolina at Greensboro a try. Martinez started studying nutrition and gradually got drawn into a laboratory where the professor was studying the biochemistry of obesity. She was hooked. She stayed on to get a PhD, as well as a degree as a registered dietitian, and in 2012 set out blithely to start a career in research. “I did not know what I was getting into,” she told me.

Once young biomedical scientists finish their PhDs, they go into a twilight world of academia: postdoctoral research. This is nominally additional training, but in fact postdocs form a cheap labor pool that does the lion’s share of the day-to-day research in academic labs. Nobody tracks how many postdocs are in biomedicine, but the most common estimate is that there are at least 40,000 at any given point. They often work for five years in these jobs, which, despite heavy time demands, usually pay less than $50,000 a year—a rather modest salary for someone with an advanced degree and quite possibly piles of student debt.

All this would be worth the sacrifice if a research job were waiting at the end of the process. But the job market in academic research is bad and has been getting far worse. A study by the National Institutes of Health (NIH) looking at data from 2008 showed that only about 21 percent of postdocs ended up getting a tenure-track job, and the trend has been sharply downward, as the number of postdocs has ballooned. Martinez, like many of her peers, is holding onto the slim hope that she will be one of the fortunate few to land that kind of job. She had no idea how much of a long shot getting an academic research position would be when she took a postdoc position at the University of Chicago. “It just makes it scary. Now I’m in it, there’s nothing really I can do about it. All I can do is the best that I can, and just hope for the best. I am trying to keep a level head about it. I’ve had my heart set on research so long now that I don’t want to consider other options.”

When she arrived at her postdoc job, Martinez started chipping away at several research projects at the same time. She also devoted her attention to helping young students, who juggle many different projects in her boss’s lab. Three years into her postdoc, she had lots of stimulating ideas but no polished results to publish in the scientific literature. “Without a publication as a postdoc, you’re kind of dead in the water,” she said. Her boss was willing to help her apply for a federal grant so she could get funding of her own, but the University of Chicago wouldn’t even consider that until Martinez had a journal article showing the results of her work. And not just any journal would do. Martinez figured she would need to get into a journal with a high “impact factor,” a measurement invented for commercial purposes: the rating helps journals sell ads and subscriptions. But these days it’s often used as a surrogate to suggest the quality of the research. Journals with higher impact factors publish papers that are cited more often and therefore commonly presumed to have more significance. At the top of the heap, the journal Nature has an impact factor over 40; Cell and Science have impact factors over 30.
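For readers who want to see the arithmetic, the standard two-year impact factor boils down to a simple ratio: the citations a journal’s recent articles received this year, divided by the number of citable items it published over the previous two years. The sketch below walks through that calculation; the counts are invented for illustration and are not real figures for any journal.

```python
# Toy calculation of a two-year journal impact factor.
# The numbers are made up for illustration only.
citations_to_last_two_years = 41_000   # citations in year Y to items published in years Y-1 and Y-2
citable_items_last_two_years = 1_000   # articles and reviews the journal published in Y-1 and Y-2

impact_factor = citations_to_last_two_years / citable_items_last_two_years
print(f"impact factor: {impact_factor:.1f}")   # prints 41.0
```

Note that the ratio describes the journal as a whole, not any single paper in it: a handful of heavily cited blockbusters can carry a journal whose typical paper attracts far less attention.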

These journals attract the flashiest work—though not necessarily the most careful or the most important. Martinez says her research isn’t eye-popping enough to end up in one of the big-three journals. “As a postdoc my expectation for myself is to get something published in a journal with [an impact factor of] nine or above. I’d be happy with that. Fourteen would be nice.… That’s what I’m going for.” She actually prefers journals with lower profiles—the reviews are more careful, she says, and the work they publish is more detailed and nuanced. Peers in her niche field are also more likely to read them. But she put weight on the impact factor “because that’s what’s expected and what I need to do to push my career along.”

Oftentimes, hiring committees won’t even look twice at an application if the job seeker isn’t the lead author of at least one paper in a top-tier journal. Carol Greider at Johns Hopkins University said it’s a poor measure of talent, but universities face a tough job in a market glutted with job seekers. “We just hired a new assistant professor in the department, and we had four hundred applications for one job,” Greider said. “How do you filter those people? A lot of times the committees just scan down and look at how many high-profile papers there are.” Only after winnowing the pile of resumes do hiring committees start to examine the actual research that the applicants have performed.


Journal publications have overwhelmingly become the yardstick of talent in biomedical science. Job seekers depend on them. So do scientists seeking promotion, tenure, and federal grants. “I can’t tell you the number of times I’ve sat in a review panel and someone says, so-and-so published two papers in Cell, two in Nature and one in Science,” Gregory Petsko, a professor at the Weill Cornell Medical College, told a crowd of postdocs at a meeting in Chicago. In those sessions, “I’ve raised my hand and asked in my best meek voice… ‘Can you tell me what’s in those papers?’ Most of the time they can’t. They haven’t had time to read those papers. So they’re using where someone publishes as a proxy for the quality of what they published. I’m sorry. That’s wrong.” Raising his voice, he continued, “A lot of great science gets published in [less flashy] journals, while crap gets published in the single-word journals.” He coyly avoided naming Science, Nature, and Cell (he called it “Hell”) and wouldn’t even utter the phrase “impact factor” because he found the very concept so odious.

Veronique Kiermer served as executive editor of Nature and its allied journals from 2010 to 2015, when this issue came to a boil. She, too, says she’s unhappy that hiring committees and tenure review boards look first at where material has been published. She’s dismayed that the editors at Nature are essentially determining scientists’ fates when choosing which studies to publish. Editors “are looking for things that seem particularly interesting. They often get it right, and they often get it wrong. But that’s what it is. It’s a subjective judgment,” she told me. “The scientific community outsources to them the power that they haven’t asked for and shouldn’t really have.” Impact factor may gauge the overall stature of a journal, “but the fact that it has increasingly been used as a reflection of the quality of a single paper in the journal is wrong. It’s incredibly wrong.”

On the December day in 2013 when Sweden’s King Carl XVI Gustaf awarded him the Nobel Prize in Physiology or Medicine, Randy Schekman seized his moment in the public spotlight to publish an op-ed piece decrying the tyranny of the impact factor and, in particular, the journals Cell, Nature, and Science. (These journals are so deeply embedded in the everyday lives of scientists that Schekman himself had a framed cover of Cell in his office at the University of California, Berkeley, the issue containing one of his most celebrated publications. He winced a bit when I asked him about it and said maybe he should take it down.) If this is such a poor measure of scientific performance, I asked him, why don’t universities just ignore it? “Because it’s a very easy surrogate,” he replied. “It’s a number. Deans are bean counters. They like a simple number.”

Schekman said the problem with impact factors is not only that they warp science’s career system. “It’s hand in hand with the issue of reproducibility because people know what it takes to get their paper into one of these journals, and they will bend the truth to make it fit because their career is on the line.” Scientists can be tempted to pick out the best-looking data and downplay the rest, but that can distort or even invalidate results. “I don’t want to impugn their integrity, but cherry picking is just too easy,” he said. And bad as it is in the United States, Schekman said, it’s even worse in Asia, “where the [impact factor] number is sacred. In China it’s everything.” Schekman serves on a committee in Korea that rates top-level biomedical science proposals. The scientists list among their personal goals publishing a certain number of papers in journals with high impact factors. “It doesn’t matter what they’re publishing,” he said. The journal is all that counts. Chinese scientists get cash bonuses for publishing in Science, Nature, or Cell, and Schekman said they sell coauthorships for cash. That practice would fail the test of scientific integrity in the United States. Schekman helped establish eLife in part to combat the tyranny of impact factors. He said he told people at Thomson Reuters, the company that generates the rating, that he didn’t want one. They calculated one anyway.

Sometimes gaming the publication system can be as easy as skipping a particular experiment. Olaf Andersen, a journal editor and professor at Weill Cornell Medical College, has seen this type of omission. “You have a story that looks very good. You’ve not done anything wrong. But you know the system better than anybody, and you know that there’s an experiment that’s going to, with a yes or no, tell you whether you’re right or wrong,” Andersen told me. “Some people are not willing to do that experiment.” A journal can crank up the pressure even more by telling scientists that it will likely accept their paper if they can conduct one more experiment backing up their findings. Just think of the incentive that creates to produce exactly what you’re looking for. “That is dangerous,” Kiermer said. “That is really scary.”

Something like that apparently happened in a celebrated case of scientific misconduct in 2014. Researchers in Japan claimed to have developed an easy technique for producing extraordinarily useful stem cells. A simple stress, like giving cells an acid bath or squeezing them through a tiny glass pipette, could reprogram them to become amazingly versatile. The paper was reportedly rejected by Science, Nature, and Cell. Undaunted, the researchers modified it and then resubmitted to Nature, which published it. Nature won’t say what changes the authors had made to enable it to pass muster on a second review, but the paper didn’t stand the test of time. Labs around the world tried and failed to reproduce the work (and ultimately suggested how the original researchers may have been fooled into believing that they had a genuine effect). RIKEN, the Japanese research institute where the work was done, found the first author guilty of scientific misconduct, and the paper was retracted. Her respected professor committed suicide as the story unfolded in the public spotlight.


Outright fraud also creeps into science, just as in any other human endeavor. Scientists concerned about reproducibility broadly agree that fraud is not a major factor, but it does sit at the end of a spectrum of problems confronting biomedicine. The thinly staffed federal Office of Research Integrity, which identifies about a dozen cases of scientific misconduct a year, catalogs its formal findings on its website. A scroll down this page will introduce you to a former graduate student who, while working at the Albert Einstein College of Medicine, falsified data used in three journal publications and four meeting presentations. Investigators said she falsified dozens of image panels and fabricated numbers used in graphs and illustrations. An associate professor at Rowan University School of Osteopathic Medicine intentionally fabricated data leading to eight published papers and an NIH grant application. Investigators found that he “duplicated images, or trimmed and/or manipulated blot images from unrelated sources to obscure their origin, and relabeled them to represent different experimental results.”

Few of these stories ever make the news. And punishment is generally mild: frequently scientists agree to work under close supervision or are barred from getting federal research grants for a few years. Many are foreign scientists who vanish from the US research scene. The Office of Research Integrity lacks the staff to investigate many cases, so its modest output is a poor measure of scientific misconduct in the United States.

Another way to measure misconduct, as well as less serious offenses, is to watch for retractions in the scientific literature. Ivan Oransky and Adam Marcus started doing that as a hobby in 2010 on a blog they set up called Retraction Watch. Oransky figured they’d post a couple of items a month. Shortly after the blog started out, “Adam was quoted saying… ‘Our mothers will read it, and that will be fun,’” he said. But this did not turn out to be a sleepy enterprise. Retraction Watch appeared in the midst of a dramatic surge in the number of retractions. While there had been about forty retractions in 2001, Oransky said there were four hundred in 2010 and five or six hundred annually in the years since. The hobby swelled to a full-scale project, with staff and supporting grants.

Retraction Watch has fed a growing curiosity—and concern—about dubious research findings. Blog reporters chase down each new report of a retraction and try to get the backstory. Oransky and Marcus also maintain the Retraction Watch leaderboard, listing the scientists with the most retractions. Japanese anesthesia researcher Yoshitaka Fujii heads the list with more than 180 retracted papers—virtually every paper he ever published. That record leaves the competition in the dust. German anesthesia researcher Joachim Boldt weighed in with about one hundred dubious publications.

Retractions aren’t limited to obscure scientists in out-of-the-way institutions. Robert Weinberg at the Massachusetts Institute of Technology has retracted five papers, including one with over five hundred citations. A graduate student in Weinberg’s sprawling and highly competitive lab was the lead author on four of those papers. Weinberg says he called for an investigation after other members of his lab raised doubts about the student’s work. Weinberg concluded that “everything was tainted,” and nothing could be salvaged. “When people ask me about it I discourage them from trying to follow up on the work. That has been the one significant bump in the road I’ve had in terms of reproducibility.”

Published retractions tend to be bland statements that some particular experiment was not reliable, but those notices often obscure the underlying reason. Arturo Casadevall at Johns Hopkins University and colleague Ferric Fang at the University of Washington dug into retractions and discovered a more disturbing truth: 70 percent of the retractions they studied resulted from bad behavior, not simply error. They also concluded that retractions are more common in high-profile journals—where scientists are most eager to publish in order to advance their careers. “We’re dealing with a real deep problem in the culture,” Casadevall said, “which is leading to significant degradation of the literature.” And even though retractions are on the rise, they are still rarities—only 0.02 percent of papers are retracted, Oransky estimates.

David Allison at the University of Alabama, Birmingham, and colleagues discovered just how hard it can be to get journals to set the record straight. Some scientists outright refuse to retract obviously wrong information, and journals may not insist. Allison and his colleagues sent letters to journals pointing out mistakes and asking for corrections. They were flabbergasted to find that some journals demanded payment—up to $2,100—just to publish their letter pointing out someone else’s error.

It’s fair to ask why David Allison should be responsible for pointing out other researchers’ errors in the first place. There’s a very human answer to that question: scientists, like everyone else, hate to admit they are wrong—partly out of pride and partly because an error serves as a black mark against career advancement, tenure, and funding. “If we created more of a fault-free system for admitting mistakes it would change the world,” said Sean Morrison, a Howard Hughes Medical Institute investigator at the University of Texas Southwestern Medical Center. “You have to have a culture where you don’t feel the sky is going to fall on your head if you come out and say that [a finding] wasn’t right.”

Biomedical science is nowhere near that point right now, and it’s hard to see how to change that culture. Morrison said that it’s unfortunately in nobody’s interest to call attention to errors or misconduct—especially the latter. The scientists calling out problems worry about their own careers; universities worry about their reputations and potential lawsuits brought by the accused. And journals don’t like to publish corrections, admitting errors that sharper editing and peer review could well have avoided.

The resulting system can make a search for the truth a treasure hunt through the literature, with critiques often published in different journals and not necessarily cross-referenced. This is a result of using journal publications as the currency of science, with careers built on high-profile publications and torn down by corrections and retractions. “The literature should be more a living, evolving thing rather than full of contradictions,” Morrison said. But it’s hard to evolve away from an academic system that counts papers and is driven by a multi-billion-dollar publishing industry.

Often, errant studies simply fade away, sunk by their own weight, rarely referenced or used as the basis for ongoing research. They’re just a line on someone’s publication list and one more entry in the MEDLINE database of biomedical literature, which catalogs more than 23 million papers. But when there’s an error in a splashy paper or by a big-name lab, setting the record straight can be an ordeal.

In the case of the study comparing Asian and Caucasian gene expression discussed in Chapter 6, Josh Akey, Jeff Leek, and their colleagues raised questions shortly after publication of the original paper. They wrote up their critique in a letter to the journal editor. The original authors were given a chance to tell their side of the story. You could practically hear them speaking angrily through clenched teeth. First, Richard Spielman and Vivian Cheung admitted that they had not, in fact, placed the Caucasian and Asian samples randomly on each of the microarray chips they’d studied. (That would have been impossible, given that years elapsed between the experiments with Asian and Caucasian samples.) “We regret our incorrect statement that randomization was carried out and we appreciate this chance to correct the record,” they wrote. But their tone then turned prickly and defensive as they asserted that the batch effect “does not imply, or even suggest, that there is ‘systematic and uncorrectable bias.’”

They did not correct their conclusion that more than 1,000 genes are expressed differently in Caucasians versus Asians. Instead, they pointed to an independent study that had identified about thirty genes that were expressed differently and noted nine other genes in their own study that differed between Caucasians and Asians. Clearly the paper did identify some racial differences, even if it couldn’t back their original claim that the difference involved a substantial share (about 25 percent) of the genes they had studied.

A peer reviewer aware of the batch-effect issue would never have allowed publication of a paper with this fundamental problem in the first place. But instead of retracting the paper, Cheung and Spielman left it standing in the scientific literature. It has now been cited more than three hundred times—in many cases by scientists who take it at face value. And the attempt by Akey and colleagues to correct the record wasn’t a pleasant experience.

“We had a lot of trepidation about writing that technical comment because Richard and Vivian were much more experienced, established investigators,” Akey told me, noting that he and his colleagues were just a few years into their careers. “It was not clear what the risk/reward ratio would be. With that said, everybody believes science is a self-correcting process, and ultimately we felt it was important to point this out and to let other people start thinking about some of these issues in more detail.” The message in their technical note circulated among biostatisticians and geneticists who analyze this kind of data, but Akey says it’s not at all clear that scientists who are trying to put these findings into a biological context understand the weakness in the Spielman/Cheung paper in particular or in other studies using similar methods. And the scientists who made the mistake were not happy to have it pointed out publicly. “Vivian at the time was really, really mad at us,” Akey told me. He said her attitude has softened over the years, but even so Cheung declined to discuss the episode with me.


“Most people who work in science are working as hard as they can. They are working as long as they can in terms of the hours they are putting in,” said social scientist Brian Martinson. “They are often going beyond their own physical limits. And they are working as smart as they can. And so if you are doing all those things, what else can you do to get an edge, to get ahead, to be the person who crosses the finish line first? All you can do is cut corners. That’s the only option left you.” Martinson works at HealthPartners Institute, a nonprofit research agency in Minnesota. He has documented some of this behavior in anonymous surveys. Scientists rarely admit to outright misbehavior, but nearly a third of those he has surveyed admit to questionable practices such as dropping data that weakens a result, based on a “gut feeling,” or changing the design, methodology, or results of a study in response to pressures from a funding source. (Daniele Fanelli, now at Stanford University, came to a similar conclusion in a separate study.)

One of Martinson’s surveys found that 14 percent of scientists have observed serious misconduct such as fabrication or falsification, and 72 percent of scientists who responded said they were aware of less egregious behavior that falls into a category that universities label “questionable” and Martinson calls “detrimental.” In fact, almost half of the scientists acknowledged that they personally had used one or more of these practices in the past three years. And though he didn’t call these practices “questionable” or “detrimental” in his surveys, “I think people understand that they are admitting to something that they probably shouldn’t have done.” Martinson can’t directly link those reports to poor reproducibility in biomedicine. Nobody has funded a study exactly on that point. “But at the same time I think there’s plenty of social science theory, particularly coming out of social psychology, that tells us that if you set up a structure this way… it’s going to lead to bad behavior.”

Part of the problem boils down to an element of human nature that we develop as children and never let go of. Our notion of what’s “right” and “fair” doesn’t form in a vacuum. People look around and see how other people are behaving as a cue to their own behavior. If you perceive you have a fair shot, you’re less likely to bend the rules. “But if you feel the principles of distributive justice have been violated, you’ll say, ‘Screw it. Everybody cheats; I’m going to cheat too,’” Martinson said. If scientists perceive they are being treated unfairly, “they themselves are more likely to engage in less-than-ideal behavior. It’s that simple.” Scientists are smart, but that doesn’t exempt them from the rules that govern human behavior.

And once scientists start cutting corners, that practice has a natural tendency to spread throughout science. Martinson pointed to a paper arguing that sloppy labs actually outcompete good labs and gain an advantage. Paul Smaldino at the University of California, Merced, and Richard McElreath at the Max Planck Institute for Evolutionary Anthropology ran a model showing that labs that use quick-and-dirty practices will propagate more quickly than careful labs. The pressures of natural selection and evolution actually favor these labs because the volume of articles is rewarded over the quality of what gets published. Scientists who adopt these rapid-fire practices are more likely to succeed and to start new “progeny” labs that adopt the same dubious practices. “We term this process the natural selection of bad science to indicate that it requires no conscious strategizing nor cheating on the part of researchers,” Smaldino and McElreath wrote. This isn’t evolution in the strict biological sense, but they argue the same general principles apply as the culture of science evolves.
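To get a feel for the dynamic Smaldino and McElreath describe, here is a deliberately simplified simulation in the same spirit: labs vary in how much effort they devote to rigor, publication counts alone determine which labs seed the next generation, and habits are inherited with a little random drift. The payoff function and parameter values are invented for illustration; this is a sketch of the idea, not their actual model.

```python
# Minimal sketch of "the natural selection of bad science": labs that spend less
# effort on rigor publish more papers, and paper counts alone decide which labs
# reproduce. All parameters are illustrative assumptions.
import random

random.seed(1)
N_LABS, GENERATIONS, MUTATION = 100, 50, 0.02

# Each lab has an "effort" level in [0, 1]: high effort means careful but slow.
labs = [random.random() for _ in range(N_LABS)]

def papers_published(effort):
    # Careful labs run fewer studies per year; sloppier labs crank out more.
    return 10.0 * (1.0 - 0.8 * effort)

for generation in range(GENERATIONS):
    # Selection: labs spawn "progeny" labs in proportion to publication volume only.
    weights = [papers_published(effort) for effort in labs]
    parents = random.choices(labs, weights=weights, k=N_LABS)
    # Progeny inherit the parent lab's habits, plus a small random drift.
    labs = [min(1.0, max(0.0, effort + random.gauss(0.0, MUTATION))) for effort in parents]

print(f"mean effort after {GENERATIONS} generations: {sum(labs) / len(labs):.2f}")
```

Run it and the average effort level drifts steadily downward, even though no individual lab ever decides to cheat; the selection step rewards volume alone, and that is enough.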

A driving force encouraging that behavior is the huge imbalance between the money available for biomedical research and the demand for it among scientists, Martinson argues. “The core issues really come down to the fact that there are too many scientists competing for too few dollars, and too many postdocs competing for too few faculty positions. Everything else is symptoms of those two problems,” Martinson said. This is a problem not only for people seeking jobs and promotions but for scientists fighting for grant money. Thirty years ago, about one-third of all NIH research proposals received grant funding. That figure has fallen sharply to around 17 percent. Among other things, that means the scientists who run labs often spend most of their time writing grant proposals rather than running experiments. Congress inadvertently made the problem worse by showering the NIH with additional funding. The agency’s budget doubled between 1998 and 2003, sparking a gold rush mentality. The amount of lab space for biomedical research increased by 50 percent, and universities created a flood of new jobs. But in 2003 the NIH budget flattened out. Spending power actually fell by more than 20 percent in the following decade, leaving empty labs and increasingly brutal competition for the shrinking pool of grant funding. The system remains far out of balance.

Compounding the problem, states have drastically curtailed financial support for universities. It’s common now for campuses to get only a small fraction of their funding from the states that proudly (and deceptively) affix their names to these institutions. To cite just one example, the marquee medical school University of California, San Francisco (UCSF), gets just 3 percent of its funding from the state of California. That means researchers must raise their own funds through grant applications, and if they fail in that increasingly competitive process, they can lose their jobs. Henry Bourne, an emeritus researcher at UCSF, says that at his high-ranking medical school, the administration no longer judges its scientists by the quality of their work; the bottom line is whether they can bring in enough money. “What we have is a Darwinian winnowing: We take them if NIH gives them a grant. And we don’t if they don’t. And that would be fine if the NIH was giving enough grants to ensure that we weren’t rejecting people who are actually very good.” But they’re not, he says. Universities typically take more than half of a scientist’s grant to pay for overhead expenses that states used to shoulder, back in the day when they contributed significantly to their flagship universities. Labor economist Paula Stephan at Georgia State University likens it to a shopping mall: The university owns the building and charges rent; the scientists have become the tenants, spending their grant money on rent as well as research assistants and materials. If they can’t keep bringing in the money, tough. They’re out of business.

Success typically requires building up a reputation by publishing a lot of flashy journal articles. To get into a high-impact journal, the story has to be unexpected (perhaps simply because it’s not correct) and exciting (which may or may not make it important). Psychiatrist Christiaan Vinkers and his colleagues at the University Medical Center in Utrecht, Holland, have documented a sharp rise in hype in medical journals. They found a dramatic increase in the use of “positive words” in the opening section of papers, “particularly the words ‘robust,’ ‘novel,’ ‘innovative,’ and ‘unprecedented,’ which increased in relative frequency up to 15,000%” between 1974 and 2014.

To get a paper in a top journal, scientists also need a squeaky clean story—free of peripheral observations that could raise any questions about the central findings and with no weak statistical findings. Of course, the real world of biomedicine is complex and untidy, so superclean studies actually merit suspicion rather than the public spotlight. “There is a lot of pressure for beautiful results or really clean results,” said Ken Yamada, a senior researcher at the NIH and editor of an academic journal. “And I think there used to be logic to it. Many years ago if somebody showed beautiful data, that almost always implied that they had to have repeated the experiment multiple times.” It strongly suggested robust results. “But nowadays if people adjust the appearance of things or just pick the single perfect example—and there are a lot of less convincing examples—there’s no way for you to know because all the data aren’t shown. It looks beautiful. It’s convincing. Pictures don’t lie”—or so we readily believe.

Yamada says this isn’t necessarily a deliberate attempt to deceive. “There’s a lot of misunderstanding [among scientists] about the integrity of scientific information. I personally think that part of it comes from just the removal of red eyes from photographs, making things look prettier just in everyday life.” But prettifying can easily go too far. For example, scientists employing a technique called single-particle electron microscopy use computer software to help them make sharper images. Sometimes scientists feed in a mathematical “model” representing what they’re expecting to see, so if something like that pops up in their field of view, the software will recognize it and make the image sharper. (Your digital camera does something like this when it stabilizes an image, adjusting the pixels that would make the picture blurry.) Maxim Shatsky and Richard Hall at the Lawrence Berkeley National Laboratory showed how this technique can lead researchers astray. They used the iconic photograph of Albert Einstein sticking out his tongue as the model they fed into a computer. The image processing software was programmed specifically to look for hints of the Einstein image and enhance any signs of it. Shatsky and Hall then fed the computer 1,000 images of nothing but static. Lo and behold, the software “correction” produced an unmistakable picture—of Einstein sticking out his tongue. Richard Henderson at the Medical Research Council Laboratory of Molecular Biology in Cambridge, United Kingdom, said he finds completely misleading images in the literature based on extreme “corrections” like this, including, in one instance, a bogus view of a vital protein component of HIV. “One must not underestimate the ingenuity of humans to invent new ways to deceive themselves,” he wrote.
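The trap is easy to reproduce on a laptop. The sketch below is a toy version of the idea rather than Shatsky and Hall’s actual procedure: it takes images of pure random noise, aligns each one to a reference pattern by cross-correlation, and averages them. A ghost of the reference emerges even though no input image contained any signal. The reference pattern, image size, and counts are arbitrary choices for illustration.

```python
# Toy demonstration of "model bias": averaging pure noise that has been aligned
# to a reference pattern reproduces the reference. Illustrative only; not the
# original authors' code or data.
import numpy as np

rng = np.random.default_rng(0)
size, n_images = 64, 1000

# Any reference pattern will do; a simple bright blob stands in for "Einstein."
yy, xx = np.mgrid[0:size, 0:size]
reference = np.exp(-((xx - 40) ** 2 + (yy - 24) ** 2) / 60.0)
ref_fft_conj = np.conj(np.fft.fft2(reference))

aligned_sum = np.zeros((size, size))
for _ in range(n_images):
    noise = rng.normal(size=(size, size))                             # pure static, no signal
    corr = np.real(np.fft.ifft2(np.fft.fft2(noise) * ref_fft_conj))   # circular cross-correlation
    shift_y, shift_x = np.unravel_index(np.argmax(corr), corr.shape)  # best-matching shift
    aligned_sum += np.roll(noise, (-shift_y, -shift_x), axis=(0, 1))  # line the noise up with the reference

average = aligned_sum / n_images
match = np.corrcoef(average.ravel(), reference.ravel())[0, 1]
print(f"correlation between averaged noise and the reference: {match:.2f}")
```

The bias comes entirely from the alignment step: once every noise image has been shifted to wherever it happens to resemble the template most, the template’s shape is stamped onto the average.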

The deep structural and funding problems throughout biomedicine are not news to anybody involved in the enterprise. In recent years, the topic has gone from a subject of idle shoptalk to a matter of serious discussion. In fact, that’s how Gregory Petsko came to be making tart comments about the journal “Hell” to young scientists in Chicago. He was speaking at a meeting put together by postdocs, including Kristina Martinez, to grapple with these existential questions. Chicago-area postdocs spent nine months in their not-so-spare time assembling a meeting under the rubric “Future of Research,” patterned after similar events in San Francisco, Boston, and New York. Postdocs realize they are inheriting a mess—a situation that not only jeopardizes their careers but makes it hard for them to solve the big problems and advance medical research, as so many have dreamed of doing.

During the morning’s presentation, the sixty-seven-year-old Petsko told them, “If we really care about the culture of science, it’s up to the old fogies of the world to do something about it.” His generation was running the show when the system broke. But these young scientists seemed determined to identify their own ways to reform the culture of biomedicine: improve the mentor-protégé relationship, find better ways to collaborate (and be rewarded for that), fight against the tyranny of journal impact factors, and avoid the pressure to exaggerate and hype results.


In 2014, some leaders of the biomedical enterprise decided it was time to start a serious conversation about these issues. Bruce Alberts (former president of the National Academy of Sciences), Marc Kirschner (chair of systems biology at Harvard), Shirley Tilghman (former president of Princeton), and Harold Varmus (then head of the National Cancer Institute) wrote a paper titled “Rescuing US Biomedical Research from Its Systemic Flaws.” They acknowledged that they could no longer let these structural problems fester and called for a meeting of minds across biomedicine to seek solutions. To prime the pump, they suggested a few of their own.

The article wasn’t simply a predictable lamentation about the need for more money in research. Even optimistic increases in funding won’t create the needed balance. Instead, scientists and their institutions need to make some hard choices—for instance, reducing the role of postdocs and hiring more scientists into regular staff jobs. The article became the centerpiece of a wide-ranging conversation among people in the field. Four months after it was published, the authors convened a planning meeting of about thirty people, representing universities, scientists, students, and government agencies, to set an agenda for an even wider discussion. The focus was not reproducibility per se but the field’s underlying pressures: the hypercompetitive environment of biomedical science. The meeting ended in discord, with no agreement even about how to approach a larger conversation. Attendees did agree, though, on one point: “Doing nothing is not an option.” The four leaders didn’t give up entirely: they created a small organization called Rescuing Biomedical Research to keep pushing for systematic change—including a discussion about reducing errors in science.

Those with the most economic power—the federal funding agencies—can’t simply impose solutions from above. “The NIH is terrified of offending institutions,” said Henry Bourne at UCSF. Conventional politics in part drives congressional funding for biomedicine. Members of Congress support institutions in their districts because local economies grow when federal dollars flow to universities and medical centers. Congress has also funded biomedical research because so many politicians have a sick relative or a dying friend and want to support the search for treatments and cures. But Bourne fears enthusiasm for that more foresighted reasoning has waned. “The government, and actually the American people, have suddenly realized that they’re spending a lot of money and cancer isn’t yet cured, so to speak. We bragged that we would cure cancer, and then it turns out we didn’t.” Bourne worries that “everyone suddenly thinks research is terrible and it’s not worth anything.” He doesn’t hold that view himself, naturally, but he does understand how frustration arises from the slow pace of progress.

Bourne has ideas about how to improve matters. For example, he’d like his university to establish an endowment to fund key professors’ base salaries to reduce the do-or-die scramble for research dollars. But he also believes scientists themselves need to change. “I think that is what the real problem is—balancing ambition and delight,” he told me. Scientists need both ambition and delight to succeed, but right now the money crunch has tilted them far too much in the direction of personal ambition. “Without curiosity, without the delight in figuring things out, you are doomed to make up stories. Occasionally they’ll be right, but frequently they will be not. And the whole history of science before the experimental age is essentially that. They’d make up stories, and there wouldn’t be anything to most of them. Biomedical science was confined to the four humors. You know how wonderful that was!” Hippocrates’s system based on blood, yellow and black bile, and phlegm didn’t exactly create a solid foundation for understanding disease. Bourne argued that if scientists don’t focus on the delight of discovery, “what you have is a whole bunch of people who are just like everybody else: they want to get ahead, put food on the table, enjoy themselves. In order to do so, they feel like they have to publish papers. And they do, because they can’t get any money if they don’t.” But papers themselves don’t move science forward if they spring from flimsy ideas.

There has never been a more important time to get this right. Biology is in the throes of a shift from small studies to big data. In this new world, quality is paramount. Scientists are starting to mine massive amounts of data to discover unsuspected links between genes, behavior, biochemistry, and disease. This is the foundation of what’s being called “personalized medicine” or “precision medicine.” The NIH and Barack Obama’s White House recognized this as a major new initiative. Indeed, it could be the future of medicine. Unfortunately, some of the foundational work has started off on less than rigorous footing. And without reliable, consistent information to work with, precision medicine may find itself facing the dreaded phenomenon computer scientists have memorably labeled “garbage in, garbage out.”