VIGNETTE: THE PERILS OF PREDICTIVE VALUE
Perception requires imagination because the data people encounter in their lives are never complete and always equivocal.
—LEONARD MLODINOW
“The data people encounter in their lives are never complete and always equivocal”—so says physicist and author Leonard Mlodinow in his book The Drunkard’s Walk, a meditation on randomness and how people choose to incorporate it, or ignore it altogether, as they go about their daily lives. But how much imagination is required for a person to perceive the equivocal nature of a blood test that informs them they are going to die?
A lot, as it turns out.
Yet these tests are administered all the time, and only infrequently do patients or doctors account for their equivocality. We saw this in the previous chapter when we looked at the prostate-specific antigen test: positive tests only very rarely uncovered disease that would have led to terrible outcomes, and yet because of the equivocal data that PSA testing produced at the group level, many men ended up enduring fairly terrible treatments that they otherwise would not have undergone.
Another way of thinking about this is to ask the following question: What happens when we approach the middle of the spectrum of certainty? In this equivocal territory, it becomes vitally important to understand the size of the risks and the magnitude of the benefits. Again, we observed this with PSA: the risks of being overdiagnosed are quite real, and fairly common, while the benefit, if one exists at all, is on the order of one life saved per one thousand men over ten years. When I go on to discuss screening mammograms in the following chapter, we’ll need to keep this in mind.
But how does overdiagnosis happen? What are its statistical mechanics? Why can’t we just develop a test that’s 99 percent accurate and be done with it?
In fact, we can, and we have. Most tests aren’t that good, but some are, and despite this we can still produce overdiagnosis. To understand this point is to understand at least part of the controversy about screening mammogram recommendations. So to more fully appreciate this phenomenon, let’s see how this played out when one patient learned the news of a routine blood test.
Mlodinow’s Story
On a Friday afternoon in 1989, one man in California received some very discouraging news from his doctor. The chances that he would be dead within a decade were “999 out of 1,000,” according to the doctor. “I’m really sorry,” the doc added as he relayed the news, by telephone.
The test was, of course, for HIV, and it had come back positive as part of a routine insurance screen. The gentleman in question had been diagnosed with the virus that would eventually cause AIDS and lead to his demise. At that time, there was very little in the way of treatment: AZT, the first drug for HIV, had been approved two years before, but patients who took it improved initially only to succumb as the virus became resistant to the drug’s effects. So-called triple therapy, which has allowed doctors to turn HIV into a chronic and manageable disease, was still more than five years away. The test signified a death sentence, although in the world of medicine at that time it constituted yet another routine blip in the ever-growing pile of cases as the HIV epidemic spread, especially in California.
There was one aspect of this test that was unusual, however. It involved the person being tested: the very same Leonard Mlodinow whose quote opened this chapter. Because of his training, Mlodinow understood the nature of numbers and statistics. After what must have been a very harrowing few days, and perhaps weeks, of concentrated thought and research on the HIV test, he was able to figure out something quite remarkable: his “positive” test for HIV could actually be interpreted to mean that he probably wasn’t infected after all. Which, in fact, was the case: Mlodinow wasn’t infected with HIV, and he has kicked around ever since, producing several very readable works of popular science for an audience grateful for the misdiagnosis.
How could this be? Overall, the HIV test at the time was, in fact, quite accurate. A person with HIV was very likely to have a positive test, and an uninfected person such as Professor Mlodinow was very likely to have a negative one. And yet the counterintuitive third fact is that, despite these two statistical truths, a positive test from a randomly screened person was very likely to be a mistake. The insurance company and the physician both got it badly wrong.
Mlodinow’s story throws a few features of modern medicine into sharp relief. The first and most obvious is the way in which highly accurate tests can nevertheless lead to deeply inaccurate interpretations. A second issue Mlodinow’s story raises is the process by which we understand what it is for a physician to “know” something. Part of why the story is so jarring is how spectacularly the physician fails Stats 101: rather than 999-in-1,000 odds of being infected with HIV, Mlodinow relates that his odds were more likely about 1 in 9. This is a whopper of a mistake, and what makes it so troubling is that we’re not inclined to think of physicians as the kind of people who make such critical errors. When coupled with his questionable judgment in relaying such news over the phone rather than scheduling a face-to-face visit in the clinic, the doc doesn’t come across as particularly professional.
But again, how could this be? The answer can be found in the idea of what constitutes predictive value. Predictive value refers to whether a given test result can truly be thought of as representing the presence or absence of disease—that is, if a test is positive and has a high positive predictive value, then that person probably does have the disease. Similarly, if a test is negative and has a high negative predictive value, then a negative test really is cause for reassurance. For instance, in a few chapters, we’ll see how the Lyme disease test has very good negative predictive value if a patient has been symptomatic for several months—if the test is negative, then whatever the problem is, it isn’t Lyme.
However, accuracy and predictive value aren’t the same thing, and this is because predictive value is determined in part by the probability that someone has a disease. Unsurprisingly, this is referred to as the pretest probability. In other words, even when dealing with a fairly accurate test, if the pretest probability of someone having a given disease is low, then the positive predictive value will suffer. The lower the pretest probability, the lower the positive predictive value. Similarly, the lower a test’s accuracy, the lower the positive predictive value.
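To put that relationship in standard textbook form (the terms here are mine, not the chapter’s: “sensitivity” is the chance that an infected person tests positive, and “specificity” is the chance that an uninfected person tests negative), positive predictive value follows from Bayes’ rule:

\[
\text{PPV} = \frac{\text{sensitivity} \times p}{\text{sensitivity} \times p + (1 - \text{specificity}) \times (1 - p)}
\]

where \(p\) is the pretest probability. As \(p\) shrinks toward zero, the false-positive term in the denominator swamps the true-positive term in the numerator, and the positive predictive value collapses no matter how accurate the test is.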
The reason Leonard Mlodinow’s positive HIV test was so unlikely to be correct is that his pretest probability was low. You can’t perform a blood test to define someone’s pretest probability, but we can infer that Mlodinow’s was low because he was being tested as part of an insurance screen without any signs of illness. If, by contrast, he had been experiencing unintentional weight loss, moderate fatigue, and a persistent nagging cough, especially as someone living in that place at that time, his pretest probability would have been much higher, and so the likelihood that his positive test was truly positive would have been much higher.
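To see the arithmetic play out, here is a minimal sketch in Python. The sensitivity, specificity, and prevalence figures are illustrative assumptions chosen to land near the numbers in Mlodinow’s story, not the measured performance of any real HIV assay:

```python
# A minimal sketch of how pretest probability drives positive predictive
# value. All numbers below are illustrative assumptions, not the actual
# performance characteristics of any HIV test.

def positive_predictive_value(sensitivity, specificity, pretest_prob):
    """Bayes' rule: the probability of disease given a positive test."""
    true_positives = sensitivity * pretest_prob
    false_positives = (1 - specificity) * (1 - pretest_prob)
    return true_positives / (true_positives + false_positives)

SENSITIVITY = 0.999  # assumed: 999 of 1,000 infected people test positive
SPECIFICITY = 0.999  # assumed: 1 of 1,000 uninfected people tests positive

# Routine insurance screen of a low-risk person (assumed prevalence 1 in 10,000)
print(positive_predictive_value(SENSITIVITY, SPECIFICITY, 1 / 10_000))
# -> about 0.09: roughly a 1-in-11 chance the positive result is real

# Symptomatic patient in a high-prevalence setting (assumed pretest prob. of 30%)
print(positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.30))
# -> about 0.998: the very same test is now almost certainly right
```

Under those assumptions, the identical test gives the routine screenee about a 1-in-11 chance of true infection, in the same ballpark as the 1 in 9 Mlodinow worked out, while for the symptomatic patient a positive result is all but conclusive.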
Note, however, that I’m not talking about certainties in either direction. A positive result from a test with low positive predictive value still does sometimes truly indicate that someone has disease, and a positive result from a test with high positive predictive value is sometimes wrong. Uncertainty is ever present, but there is power in being able to quantify the uncertainty. Certainly someone receiving a diagnosis of HIV in the days before effective therapy would have been disconsolate at being told the odds of truly being infected were 999 in 1,000; the same person, on learning the chances of really having HIV were 1 in 9, would likely have breathed more deeply, though perhaps not altogether normally.*
In the next chapter, we’ll keep this concept of positive predictive value in mind, as I give numbers to the predictive value of screening mammograms, looking at the diagnosis of breast cancer in women who learn such a diagnosis in much the same way that Leonard Mlodinow learned his test meant that he was infected with HIV. I’ll quantify the levels of uncertainty surrounding screening mammograms, and in doing so illuminate one of the most impassioned topics in public health today.
* That was true of HIV screening in the late 1980s but has not been true for many years because of additional testing that has eliminated the false-positive problem for that disease. So, please, get your HIV screening test done!