The Creative Destruction of Medicine

FIGURE 2.1: The risk of death, heart attack, and stroke among 17,800 patients randomly assigned to receive either Rosuvastatin or placebo for over four years. Source: P. M. Ridker, “Rosuvastatin to Prevent Vascular Events in Men and Women with Elevated C-Reactive Protein,” New England Journal of Medicine 359 (2008): 2195–207.

This is as good as it gets. The trials of Lipitor and Crestor are state-of-the-art and considered exemplary proof for the broad use of these medicines for preventing heart disease. There is nothing particularly unusual about Lipitor or Crestor; they are commonly used drugs and heavily promoted. But what this represents is really population medicine, the antithesis of medicine directed at and for an individual. The push for “statins for all” has been dubbed “mass medicalization” and accounts for $26 billion a year of statin prescription costs.⁶ Instead of identifying the 1 person or 2 people out of every 100 who would benefit, the whole population with the criteria that were tested is deemed treatable with sufficient, incontrovertible statistical proof. Even the term “NNT” (number needed to treat) has been coined to denote how many people have to be given a therapy for one of them to derive the expected benefit. You can even look up the number needed to treat for many medicines on the website www.theNNT.com. What constitutes evidence-based medicine today is what is good for a large population, not for any particular individual.
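To make the arithmetic behind the NNT concrete, here is a minimal sketch in Python. It assumes the standard definition, in which the NNT is the reciprocal of the absolute risk reduction, rounded up to a whole person. The function name `nnt_from_benefit` is made up for illustration, and the inputs are simply the "1 person or 2 people out of every 100" figures from the paragraph, not data from any particular trial.

```python
from fractions import Fraction
import math


def nnt_from_benefit(benefiting: int, per: int = 100) -> int:
    """Return the number needed to treat, given how many people out of
    `per` treated actually benefit (the absolute risk reduction)."""
    if benefiting <= 0 or per <= 0:
        raise ValueError("need a positive number of people benefiting")
    # Exact fraction avoids floating-point surprises such as 1/0.01.
    absolute_risk_reduction = Fraction(benefiting, per)
    # Round up: a fraction of a person cannot be treated.
    return math.ceil(1 / absolute_risk_reduction)


if __name__ == "__main__":
    print(nnt_from_benefit(1))  # 1 in 100 benefits -> 100 people treated
    print(nnt_from_benefit(2))  # 2 in 100 benefit  -> 50 people treated
```

Read the other way around, an NNT of 100 means that 99 of every 100 people treated take the drug without gaining the benefit that was measured.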

Another major deficiency of medicine is the use of experts to make recommendations or “guidelines” for a large proportion of decisions for which little or no data exist. These guidelines, typically published in major specialty journals, have a pronounced impact, as they are believed to represent the standard of care, even though they are based on opinion with a paucity of facts. In fact, this should be considered “eminence-based medicine.” As we are able to accrue more meaningful data and information on individuals, the hope is that we can override our dependence on such recommendations.

In the meantime, the flawed evidence-based medicine of today is being advocated as a way to provide immunity from medical malpractice. In a New York Times op-ed, Peter Orszag, formerly the director of the White House Office of Management and Budget, suggested that we could “provide safe harbors for doctors who follow evidence-based guidelines.”⁷ The 2009 U.S. economic stimulus act promoted comparative effectiveness in medical research and the new Patient-Centered Outcomes Research Institute. Unfortunately, the funding is disproportionate to the relative void of information, and the efforts that are being mounted are tied to population medicine. It’s ironic that the new initiative is called “Patient-Centered.” If only that were the case!⁸

It was August 1995, and at the Hotel de Crillon, one of the most extravagant hotels in Paris, the steering committee of the largest trial ever to be conducted had gathered to review the data for the first time. The trial, known as CAPRIE and involving more than 19,000 patients, was intended to test the efficacy of a new drug for treating vascular disease. It had to beat aspirin. Both drugs can prevent blood clots, albeit by different mechanisms. But the most important difference was price: Aspirin costs pennies per tablet. This new drug, if effective and ultimately commercially approved, would cost $4 a day. There were more than twenty experts in vascular disease in the room, as well as the senior management of the company that sponsored the trial, Sanofi. Everyone knew the stakes: the drug had cost hundreds of millions of dollars to test, but the rewards, if it was successful, were much bigger.

I was one of those experts in the room. Eager to review the data, I quickly leafed through the pages of the results book to get to the bottom line: did Plavix work better than aspirin? The answer was, in my mind, quite disappointing. There was only an 8.7 percent improvement in the end point, with only 2 patients per 100 benefiting.⁹ Still, it was considered really first-rate evidence-based medicine and potentially good for the population of patients who have vascular disease, and within a year the Food and Drug Administration (FDA) along with regulatory authorities all over the world approved the use of the drug—known as Plavix—for patients with vascular disease.

By 2010, Plavix had become the second-largest prescription drug by dollar sales in the world, just after Lipitor, with $9 billion of sales a year. But the FDA placed a black-box warning on Plavix to inform doctors that the drug may not work in patients if they are carrying particular gene variants.¹⁰ It turns out that at least 30 percent of people cannot normally metabolize Plavix to convert it to an active drug. Without the intact functioning gene in the liver cells responsible for metabolism, known as cytochrome 2C19 (or CYP2C19), Plavix does not adequately suppress the platelets or prevent blood clots. So if a patient has a stent put in his coronary artery because of a blockage, and he carries an allele (alternative form) of the gene that does not allow the metabolism of Plavix, the risk of developing a blood clot in the stent is at least 300 percent greater. Although clotting of a stent is not a frequent phenomenon, if it occurs, it is catastrophic and usually results in a heart attack or death. For individuals who carry one so-called loss-of-function allele, the problem can sometimes be overridden by doubling the dose of Plavix. With two copies of the CYP2C19 loss-of-function gene variants, however, the chance of not metabolizing the drug approaches 100 percent. Interestingly, there are also alleles that lead to faster metabolism of Plavix; if an individual carries two copies of these alleles, she would actually require a lower dose of Plavix.¹¹ But a lower dose of Plavix does not exist, and higher doses were never tested in a large trial.

Why did it take more than a decade to realize that a significant proportion of individuals do not or cannot respond to the drug? It was actually known by the late 1990s that the response to this medication was quite variable. There was even an early study of healthy volunteers that tested the CYP2C19 gene alleles and showed that the 25 percent of people who did not respond to Plavix were more likely to have a loss-of-function DNA variant.¹²

There are two explanations for why the heterogeneous response to Plavix was not discerned ten or more years ago. The first was population, evidence-based medicine. From the CAPRIE trial, with very modest impact, there was enough benefit to get the drug commercially approved and on track to be widely prescribed and promoted. Imagine if the patients in that trial had been genotyped, and the individuals who could not metabolize Plavix had been excluded from participation. But the threshold for demonstrating benefit and proof that a drug works is relatively meager for a population, and there were enough responders in the big cohort for Plavix to pass the test and show overall improvement.
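The thought experiment of excluding non-metabolizers can be sketched with back-of-the-envelope arithmetic. The Python below is only an illustration under strong assumptions that the trial itself never tested: that the roughly 30 percent of people who cannot metabolize Plavix derive no benefit at all, that they have the same baseline risk as everyone else, and that they were represented in CAPRIE in that proportion. Under those assumptions the overall relative improvement is just the responders' improvement diluted by the share of non-responders. The function name `responder_effect` is hypothetical.

```python
from fractions import Fraction


def responder_effect(observed_rrr: Fraction, nonresponder_share: Fraction) -> Fraction:
    """Relative improvement among responders, ASSUMING non-responders get
    zero benefit and share the same baseline risk as responders."""
    if not 0 <= nonresponder_share < 1:
        raise ValueError("non-responder share must be in [0, 1)")
    # overall = (1 - share) * responders  =>  responders = overall / (1 - share)
    return observed_rrr / (1 - nonresponder_share)


if __name__ == "__main__":
    observed = Fraction(87, 1000)      # the 8.7 percent improvement quoted for CAPRIE
    non_metabolizers = Fraction(3, 10)  # "at least 30 percent", taken as exactly 30
    effect = responder_effect(observed, non_metabolizers)
    print(f"{float(effect):.1%}")       # 12.4%
```

Even on these generous assumptions the effect among responders remains modest, but the exercise shows how a population average can hide who is actually being helped.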

A second explanation is the use of the same dose for all patients, which is a common problem in the development of pharmaceuticals. How could it be possible that all individuals who take a medicine would respond in the same way to the same dose? With the extraordinary differences in age, gender, and weight, not to mention metabolism and genes, is there any drug that would be uniform in its effect with respect to dose? Yet the assumption that the same dose works for all patients is quite typical for a drug company. The priority is to keep it simple. If there are multiple doses, it means much larger trials and much more complex marketing. Pharmaceutical companies are aware of the lack of modification or tweaking of a dose, since when a doctor starts a patient on a medication it is relatively unusual for the dose to be changed. The starting dose typically represents the default dose. The convenience of a “one dose fits all”¹³ mentality could not be more advantageous for both physicians and the drug manufacturer. Consequently, it is one of the essential elements of population medicine today.

On the morning of Saturday, February 11, 1984, a fifty-seven-year-old, gray-haired woman of Polish descent, wife of a steel worker, who smoked a pack a day and had a family history of heart attack, arrived at the emergency room having suffered an hour of severe, crushing chest pain. She was sweating profusely, appeared very pale, and was terrified. Her electrocardiogram showed she was in the midst of evolving a large heart attack that could damage more than 40 percent of her heart muscle. At the time, the usual care for such a patient was oxygen, morphine to alleviate the pain, lidocaine to prevent abnormal, life-threatening heart rhythms, and hope for the best. But this lady’s timing was exquisite: she was about to become the first person to receive the genetically engineered clot buster tissue plasminogen activator (t-PA). This is a naturally occurring clot-dissolving enzyme in the body, but when a blood clot underlying a heart attack or stroke forms, there is not nearly enough of it to dissolve the clot. The hope of the doctors treating her (including me) was that by rapidly supplying the protein through a vein while the patient was in the throes of a heart attack, it might be possible to restore the blood supply to the heart and prevent some or all of the damage to the heart muscle that was otherwise destined to occur.

On a Saturday morning at Baltimore City Hospital there were usually only a few doctors-in-training making rounds, and it was most unusual to see the senior, attending physicians. But because this was a history-making event, and a team had been called in to do the emergency cardiac catheterization, more than twenty doctors came out of the woodwork. We proceeded to introduce a tiny tube or catheter through the artery in the leg and threaded it up to the heart via X-ray guidance. At the spot where the arteries to the heart take off, a bit of contrast dye was shot through the catheter to see if the left anterior descending artery was cut off. Indeed it was. Now we were to give t-PA in her intravenous line, which had been placed in her right forearm. Since no patient had ever received human recombinant t-PA, we had no idea of the effective dose; the protocol called for giving 20 mg and then taking additional pictures of the artery to see if the blood supply had been restored. After we gave t-PA, we took a considerable number of pictures with dye shot down the vessel, and about fifteen minutes later the blood flow came back. The patient’s chest pain was abating, her color was restored, her EKG was improving, and the large group of doctors and nurses, in the lab and the adjacent control room, were cheering. I started to cry. Little did I know that the dose of t-PA that we used was totally insufficient to open the artery, and that it was the repeated injections of the dye that were likely responsible for restoring the blood supply to the patient’s heart. Later we would learn that it took an average of forty-five minutes and 100 mg of t-PA to open up the heart attack arteries in most patients. One more chapter of medical serendipity had been written.¹⁴

A few years later, when there were enough data showing that the blood clots were rapidly dissolved with t-PA in a high proportion of patients, the FDA advisory panel, consisting of many physician experts in the field of cardiology, weighed the evidence for approval for treating heart attack patients. At the hearing in 1987, the morning was dedicated to considering streptokinase, an inexpensive clot-dissolving agent derived from streptococcal bacteria, which had been around for decades. A large trial of almost 12,000 patients, involving most of the hospitals in Italy, had shown that streptokinase saved lives—about a 20 percent reduction in deaths—compared with not giving any clot-dissolving therapy.¹⁵ The panel voted unanimously to approve intravenous streptokinase for evolving heart attacks. But in the afternoon, when t-PA was considered, there were only data for surrogate end points. The studies showed that t-PA dissolved clots and restored blood supply, far better than streptokinase, and a small trial performed at Johns Hopkins took this a step further by demonstrating that patients receiving t-PA, compared with placebo, wound up with better heart function. But there were no data to say that lives were saved, and so the panel declined to recommend t-PA for approval.

The aftermath was a memorable one. The meeting was held in a large auditorium in the days prior to cell phones. There were pay phones in the back of the room, and one could hear stock brokers on the phones calling in, and yelling, to sell shares of the manufacturer of t-PA, Genentech. The Wall Street Journal published an editorial entitled “The Flat Earth Committee.”¹⁶ Here one of the original biotechnology companies, with its darling genetically engineered product t-PA, was going down because of the unacceptability of surrogate end point data.

Eventually, through heavy lobbying and submission of more data related to the avoidance of congestive heart failure, t-PA did garner FDA approval the following year. But it cost $2,200 a dose, as compared to $300 for streptokinase, and such a cost difference didn’t meet the standard of heightened efficacy, a cornerstone of evidence-based medicine for heart attack therapy. The vulnerability of t-PA was further highlighted when subsequent large trials from Europe directly compared t-PA with streptokinase but did not show any meaningful difference for survival. Doctors began to question whether dissolving clots more rapidly and efficiently really made a difference. Maybe the clots reaccumulated after t-PA. Maybe streptokinase did other things that were beneficial besides dissolving clots. Maybe the regimen of t-PA was inadequate. Confusion set in for doctors treating heart attacks and the medical experts who were left without a clear explanation for why t-PA didn’t prevail. But one thing was clear: t-PA was not going to be used unless it could be definitively shown to incrementally save lives.

By 1990, the big t-PA trial known as GUSTO was planned, and after a pilot study assured feasibility, a trial of over 41,000 heart attack patients in twenty countries was launched in 1991 and completed in 1993.¹⁷ At the time, this was the largest clinical trial to have ever been organized and initiated in the United States. The end point was dead or alive status of the heart attack patients at thirty days after treatment. T-PA turned out to be better than streptokinase by about 14 percent in relative terms, reducing the death rate from 7.3 percent to 6.3 percent. That means that for every 100 patients, 1 would benefit more from t-PA than from streptokinase. The only problem was identifying who those patients were. But we didn’t know that at the time of the trial, and we still don’t.
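The gap between the relative and the absolute figure is easy to check. The short Python sketch below uses only the two thirty-day death rates quoted for GUSTO; the variable names are illustrative, and the rounding up to whole patients is a convention rather than anything taken from the trial report.

```python
from fractions import Fraction
import math

# Thirty-day death rates quoted in the text, kept as exact fractions.
streptokinase_deaths = Fraction(73, 1000)  # 7.3 percent
tpa_deaths = Fraction(63, 1000)            # 6.3 percent

# Absolute reduction: percentage points of mortality avoided.
absolute_reduction = streptokinase_deaths - tpa_deaths
# Relative reduction: the same difference as a share of the comparator's rate.
relative_reduction = absolute_reduction / streptokinase_deaths
# Patients who must receive t-PA instead of streptokinase for one extra survivor.
patients_per_extra_survivor = math.ceil(1 / absolute_reduction)

print(f"absolute reduction: {float(absolute_reduction):.1%}")   # 1.0%
print(f"relative reduction: {float(relative_reduction):.1%}")   # 13.7%
print(f"patients per extra survivor: {patients_per_extra_survivor}")  # 100
```

The same trial result can thus be described as a relative improvement in the mid-teens or as 1 patient in 100, which is why the absolute number matters when thinking about any one individual.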

Several years ago a colleague of mine at the Cleveland Clinic had an abnormal prostate-specific antigen (PSA) blood test result during his annual physical examination. After it was confirmed, he underwent a prostate biopsy to determine whether he had developed prostate cancer to correlate with the elevated PSA. This is not an easy procedure to go through: a catheter is inserted in the genital tract through the penis, and a bioptome, a small pincer-shaped tool, takes multiple tiny pieces of the prostate tissue (which involves considerable pain) to be studied for any cancerous cells. But even when it was found that there was no evidence of prostate cancer (a “false-positive” PSA), he had to have serial prostate biopsies every six months for the next year to be sure that no prostate cancer had been missed. And my friend is only 1 of the 250,000 men every year in the United States who have a false-positive PSA test and then undergo multiple prostate biopsies. Thirty million men in the United States continue to have their PSA tested every year.¹⁸

Dr. Richard Ablin, a pathologist now at the University of Arizona, discovered the PSA in 1970. Forty years later he wrote an op-ed in the New York Times entitled “The Great Prostate Mistake,” in which he stated, “The test’s popularity has led to a hugely expensive public health disaster.”¹⁹ I have never known the inventor of a clinical test to declare a public health disaster based on that test. Inventors notoriously support their invention without limits. So why in this case did Ablin issue a public health warning?

Prostate cancer is extremely common in men, with over 15 percent eventually carrying the diagnosis. But only 3 percent of men actually succumb to prostate cancer.²⁰ So there is considerable prevalence of nonaggressive prostate cancer, and when this is diagnosed, typically surgery is performed to remove the prostate gland, followed by radiation and other cancer-ablating procedures. Just the PSA test alone costs the United States $3 billion per year—and billions more when the cumulative costs of all the biopsies, surgeries, treatments, and the complications of the surgery such as urinary incontinence or impotence are factored in.²¹ Ablin’s editorial plea ended with: “The medical community must confront reality and stop the inappropriate use of PSA screening. Doing so would save billions of dollars and rescue millions of men from unnecessary, debilitating treatments.”

Nonetheless, mass screening for early detection of cancer is one of the most accepted rituals of health care in the United States. The current recommendation is that all men over age fifty should have their PSA checked every year. Even though there has been some recent public debate, the current recommendations are that all women should undergo a yearly mammogram after age forty. Everyone should undergo a colonoscopy at age fifty and every five years thereafter. Similar to the problem of “one dose fits all,” how can we possibly be viewed as all having the same risk profile, given the differences in our biology and environmental exposures?

The PSA in men has a parallel in breast cancer screening using mammography in women. For mammography, the trade-offs of screening and false-positive results are now broken down by age. For every 1,000 women who undergo mammography between ages forty and forty-nine, 98 (about 10 percent) will have a false-positive mammogram, 60–200 women (the range indicates variability in multiple studies) will undergo an unnecessary biopsy, and 84 per 1,000 women screened will have to undergo additional imaging, typically involving magnetic resonance or ultrasound. Only 1 per 2,000 women screened might avoid death from breast cancer with screening. In later decades of life, the numbers don’t change very much. Even in women ages seventy to seventy-nine, false-positive mammograms occur in 69 out of every 1,000, and additional imaging is necessitated in 64 per 1,000. Proven overdiagnosis, representing unnecessary surgery, chemotherapy, or radiation, rises from 1–5 per 1,000 at ages forty to forty-nine, to 1–7 per 1,000 women ages fifty to fifty-nine.²²
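Putting the harms and the benefit on the same scale makes the trade-off easier to see. The Python sketch below does nothing more than divide the per-1,000 figures quoted above for women ages forty to forty-nine by the "1 per 2,000" benefit. The variable names are illustrative, and the result is a rough ratio of quoted rates, not an estimate from any single study.

```python
from fractions import Fraction

# Figures quoted in the text for women screened at ages 40-49.
false_positives = Fraction(98, 1000)     # false-positive mammograms
additional_imaging = Fraction(84, 1000)  # follow-up MRI or ultrasound
deaths_avoided = Fraction(1, 2000)       # "1 per 2,000 women screened"

# Harms expressed per breast cancer death avoided.
false_positives_per_death_avoided = false_positives / deaths_avoided
imaging_per_death_avoided = additional_imaging / deaths_avoided

print(int(false_positives_per_death_avoided))  # 196
print(int(imaging_per_death_avoided))          # 168
```

On these quoted figures, roughly two hundred women receive a false alarm for each woman whose death from breast cancer might be averted.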

Here is more evidence that population medicine, in this case mass screening, disregards individual variability and promotes considerably more unnecessary medical testing and procedures. Beyond the tabulation of data on false-positive results, the emotional toll on the woman and her family after notification of an abnormal mammogram is considerable and impossible to quantify.

Back in 2003, two British physicians, Nicholas Wald and Malcolm Law, published a paper on the “polypill.” One pill with six drugs in it: a low-dose statin for lowering LDL cholesterol; three blood pressure medications, each at half the regular dose; the vitamin folic acid; and low-dose aspirin. Wald and Law asserted that if everyone over fifty-five took this polypill, there would be an 88 percent reduction of coronary heart disease and an 80 percent reduction of stroke. A notably strong statement was contained in the report: “It would be acceptably safe and with widespread use would have a greater impact on the prevention of disease in the Western world than any other single intervention.”²³ This assertion of profound benefit, completely theoretical and untested, ignited controversy in medical circles. Further, the idea of giving a pill containing six medications to all people over a certain age seemed quite foreign.

One could liken this concept to the use of fluoride in the water supply, which was initiated in the United States in 1945 and has been documented to benefit the population, with an overall 15 percent reduction of dental cavities. Nevertheless, in many places around the world, there continues to be opposition to water fluoridation for ethical concerns: it is seen as forced use of medication on the masses.²⁴ The only known side effect of fluoride in the water is the condition of dental fluorosis, which is usually characterized by tiny white stains of the teeth. Still, with an inadvertent overdose the teeth can be grossly discolored. Here, if properly regulated and administered, the balance of benefit and harm weighs in favor of the practice of water fluoridation. But how does that compare with a polypill?

It took several years after the initial Wald and Law proclamation before a polypill could be produced, ultimately made in India, as such a product would not likely be a manufacturing target for large, traditional pharmaceutical companies. Clinical trial testing got under way in 2010, so it will be years before we know whether this population medicine strategy has any merit.²⁵ Perhaps by promoting adherence and decreasing the cost of drugs it will have merit in parts of the world where taking and affording multiple medications is especially problematic. One can even make a good case for this being the profile in the United States, since only 50 percent of patients actually adhere to their prescriptions, and the costs continue to skyrocket.

If successful, we may someday have the modern version of fluoridation of the water in the form of a multidrug pill taken every day. (We already have ample evidence of current prescription drugs in our water supply.)²⁶ However, the notion of encapsulating multiple drugs into one, at unconventional and fixed doses, appears to be the virtual opposite of medicine for individuals. Instead of identifying the condition to prevent or treat, every person is given multiple drugs that carry known side effects, compounded by the potential for drug interactions. Rather than attempting to focus on directed, tailored prevention or therapy for a particular person, the polypill approach capitalizes on benefits shown at the population level, but with the same problems that confront all other population-based medicine.

As problematic as these population-based investigations have been, they at least have conformed to the best-in-class model for medical studies. Consumers, unfortunately, are typically getting data from small, observational studies, published in obscure journals or not at all, in which there is no real control group or no randomization, and shaky end points. For example, one study compared two hundred individuals who took vitamin E for two years and had less heart disease than two hundred people who said they didn’t take vitamin E. Even very large-scale observational studies have led us astray. A study of 87,245 nurses suggested that using vitamin E supplements would reduce the incidence of heart disease by 30 to 40 percent. And a Finnish study of 5,133 men with fourteen-year follow-up suggested the same benefit. But when the vitamin E story was subjected to several randomized placebo-controlled, double-blind trials, there was no indication of benefit whatsoever. In fact, in one such trial of 10,000 patients, the participants actually had a surprising 21 percent higher rate of developing heart failure.²⁷

Observational studies have misled more than once. One of the most impressive mistakes relates to hormone replacement in women, which for decades was widely recommended to reduce heart disease. The leading manufacturer of hormone replacement, through the use of ghostwritten articles in medical journals, propagated some of this practice. When randomized trials were finally performed, the recommendation turned out to be completely off base. The Women’s Health Initiative trial of over 16,000 healthy postmenopausal women compared the combination of estrogen and progestin to placebo and found significant increases in breast cancer, heart disease and heart attacks, strokes, and dangerous blood clots—far overriding the benefit of less colon cancer and fewer hip fractures. The results of the trial were so negative that it was stopped prematurely, at 5.6 years (instead of the planned 15 years) of follow-up. New results released in 2011 continue to engender confusion, suggesting disparate outcomes with hormone replacement as a function of what age the treatment was initiated.²⁸

In 2005 John Ioannidis, now at Stanford University, published the article “Why Most Published Research Findings Are False” in the journal PLoS Medicine, sending chills through the academic medical community.²⁹ His conclusions are in keeping with what has been reviewed here: (1) the smaller the studies, the less likely the research findings are to be true; (2) the smaller the effect, the less likely the research findings are to be true; (3) the greater the financial and other interests, the less likely the research findings are to be true; (4) the hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true. A similar note was sounded by Jonah Lehrer in the New Yorker, in a piece entitled “The Truth Wears Off.” The significant issues in science of replicability, the subsequent “regression to the mean” after initial results are impressive (often referred to as “the winner’s curse”), the bias of the researchers, and the bias of what is actually published (mainly positive results) led him to the following conclusion: just because an idea can be proved doesn’t mean that it’s true.³⁰

I don’t want to be excessively negative, but the right assumption in reviewing any new data presented to consumers is to question it. Unlike our legal system, in which a defendant is considered innocent until proven guilty, new scientific or medical evidence has to refute and transcend the “null hypothesis.” In other words, consider the new findings null and void unless you are thoroughly convinced that the evidence is compelling. I coined the term “litter-ature” to denote that too much of the medical literature is littered with misleading and false-positive findings. That there is simply too much literature (and litter-ature) is evidenced by the statistic that only 0.5 percent of the 38 million published papers are cited more than two hundred times by others, and half were never cited. Moreover, when pooled analyses of prior studies are published, many relevant papers are excluded.³¹

All of these problems can be seen in the case of the medicines Zetia and Vytorin, which are the trade names, respectively, for ezetimibe and ezetimibe plus simvastatin. These prescription drugs help to lower LDL cholesterol in the blood. Ezetimibe was approved in 2002 on the basis of small (but randomized) studies demonstrating only a surrogate end point (a 19 percent reduction in LDL cholesterol) rather than a proven decrease in incidence of disease or death.³² The FDA, in approving the drug, simply assumed that lowering LDL cholesterol by any means would be a good thing for patients. The major outcome trial (notably dubbed “IMPROVE-IT”), which tests whether the drug actually benefits people, was not started for several years and will not be complete before 2012 at the earliest. Meanwhile, in 2008, small, randomized studies began to show that ezetimibe had no effect on artery plaque development; moreover, there were signs that this drug was linked to a higher risk of fatal cancers.³³ Media reports have been panicky and remarkably inconsistent, but professional organizations such as the American College of Cardiology and the American Heart Association, which receive large financial support from the manufacturers of the medicine, simply proclaimed that ezetimibe was safe. Annual sales in the United States reached $5 billion.

How do we get out of this mess? Better studies are part of the solution—although not all of it. We need real evidence based on individuals, not populations. Fortunately, our ability to get just that information is rapidly emerging, beginning an era characterized by the right drug, the right dose, and the right screen for the right patients, with the right doctor, at the right cost. Medicine for the common good is not good enough. Now let us see how to get something better.