Chapter 11

Big Data

What would life be without arithmetic, but a scene of horrors?

—Rev. Sydney Smith, 1835

On a sultry afternoon in the summer of 2014, I visited the Washington, DC office of Karen DeSalvo, the new director of the Office of the National Coordinator for Health Information Technology (ONC). DeSalvo, an internist and the former commissioner of health for New Orleans, had a full agenda as she tried to reposition the ONC for the post–Daddy Warbucks era (the last of the 30 billion HITECH dollars were doled out in late 2014), as well as navigate the swirling controversies over Meaningful Use, interoperability, usability, and more. After we discussed these weighty topics, she pivoted the conversation; she wanted to talk about underwear. Intelligent underwear. “Sensing underwear,” she said jauntily, “that’s my favorite thing. I was driving yesterday, thinking of all the uses for sensing underwear!”

I was a little taken aback. First of all, sitting in a large, sterile government office on the seventh floor of the Hubert Humphrey Building on Independence Avenue, within a stone’s throw of the Capitol dome, the topic was a bit more tabloid than I was expecting. Moreover, I’m not sure that I want my underwear sending off signals.

As DeSalvo sees it, smart underwear has its place. “If I had a parent with diabetes in a nursing home,” she explained, “there are things I’d want to know, things that might be helpful.” These things include hydration, body temperature, and heart rate. One can envision other uses for sensing underwear, including monitoring a recovering alcoholic for a relapse and checking to see whether a forgetful patient has taken his medications.

The improbable story of intelligent underwear began with a June 2010 article titled “Thick-Film Textile-Based Amperometric Sensors and Biosensors” and published in Analyst, the official journal of Britain’s Royal Society of Chemistry. This article did not, at first glance, appear to be one that would titillate audiences around the world, nor one that would send the healthcare IT Hype-O-Meter into the red zone. Yet the findings were remarkable. After purchasing underwear from local department stores, a team of bioengineers from UC San Diego used advanced textile screen-printing techniques to fuse carbon-based electrodes into the elastic waistband. In the article, they presented their findings: the electrodes, which held up under moderate tests of bending and stretching (and apparently can survive both bleach and your Maytag’s various cycles), reliably measured concentrations of certain body chemicals present in sweat that correlate with levels of blood alcohol and stress.

The UCSD researchers weren’t aiming for the National Enquirer, but their paper created a media buzz regarding other potential uses for “wearables.” Exercise pants that give you a heads-up if you’re working out your right- and left-sided muscles asymmetrically. Headgear that can follow your sleep patterns, and even send a text message to your office staff alerting them that you might be grumpy after a fitful night. Socks that can monitor a diabetic’s capillary flow, issuing a warning when they’ve detected an elevated risk for a foot ulcer caused by poor circulation.

And the flow of electrons need not be unidirectional. Companies are working on sensors that not only identify problems like high stress levels or low blood pressure, but also deliver appropriate treatments, perhaps through the skin. Even the underwear can go both ways: a product called “Fundawear” allows a person to signal another person’s underwear to vibrate (in several different strategic locations) via an iPhone app. This immediately jumps to number one on the list of reasons not to lose your phone.

* * *

Creating sensors to measure a wide range of biological phenomena, like your stress level or the physiologic effects of certain drugs, was once a daunting engineering problem. But over the past five years, these challenges have been overcome through the development of gizmos ranging from the tiny accelerometers in your Fitbit or Jawbone to nanosensors that can be safely ingested. And, of course, the outputs of all these miraculous devices can now connect to our smartphones and to the Internet. This means that the so-called Quantified Self movement is shifting from a technical problem (how to capture the data) to an analytics one (how to make sense of the data). And this is where the hope butts up against the hype.

* * *

The consulting firm Gartner defines big data as “high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization.” The concept has been well known in the consumer space for years, in ways that are sometimes obvious to shoppers (as in Amazon’s book recommendations) and sometimes less so (as in the Facebook advertisements for Pampers that begin popping up soon after you purchase a pregnancy test from CVS). However, its introduction into healthcare is recent, owing mostly to the availability of five relatively new data sources: clinical data from EHRs, genetic data, government data such as physician billing patterns, data from social media and digital devices, and data from sensors, both wearable and otherwise.

Shahram Ebadollahi, IBM’s chief science officer for healthcare, told me of the company’s early attempts to sell the concept of big data to healthcare organizations, as recently as a few years ago. “We were trying to convince people of the merits of how to use their data, how to derive insights, how to feed that into daily practice.” But few were interested. “Now they are saying, ‘Okay, we have the data. We don’t know what to do with it.’”

Ebadollahi finds that the biggest problem he and his Watson team are facing is not too much data—after all, Watson can sift through the equivalent of one million books in a second—but too little. In a large healthcare dataset, there might be 1,000 different data elements, each collected on at least some of the patients, whether demographics like age, gender, income, and education level; clinical problems like colon cancer or Crohn’s disease; or physiologic data such as heart rate and blood count. But 90 percent of the cells in the spreadsheet are empty, meaning that a piece of data that’s available for some patients is missing for others. On top of that, many data points are “noisy”—potentially inaccurate for a variety of reasons, ranging from keystroke errors to sensor malfunctions. Big-data folks call this part of their work “data wrangling” and sometimes refer to themselves as “data janitors,” since so much of their time is spent preparing datasets for analysis rather than performing the analyses themselves.
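To make the “data janitor” work concrete, here is a minimal sketch in Python of the kind of first-pass audit such a team might run—measuring the sparsity and flagging implausible values. The toy table, its column names, and the plausibility cutoff are my own invented illustrations, not IBM’s pipeline.

    import numpy as np
    import pandas as pd

    # Invented toy extract: each row is a patient; most cells were
    # never filled in, mirroring the 90 percent figure above.
    df = pd.DataFrame({
        "age":        [72, 68, np.nan, 25, 30],
        "hemoglobin": [np.nan, 11.2, np.nan, np.nan, 13.9],
        "heart_rate": [88, np.nan, np.nan, 300, 65],
    })

    # How sparse is it? Fraction of missing cells, per column and overall.
    print(df.isna().mean())
    print("overall missing:", df.isna().to_numpy().mean())

    # Crude wrangling rule: treat physiologically implausible readings
    # (a heart rate of 300 here; the cutoff is an invented example) as
    # sensor noise, marking them missing rather than trusting them.
    df.loc[df["heart_rate"] > 250, "heart_rate"] = np.nan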

Ebadollahi uses a TV analogy to describe how IBM is tackling the seemingly arcane but crucial problem of handling missing data. Let’s say that CNN was interested in identifying all the clips in its vast library that show an airplane taking off—and let’s assume that the computer can’t yet distinguish an airplane from other large objects, like buildings or cars. The network could hire interns to watch and index every video, but that would be wildly inefficient. Instead, IBM’s approach begins by pinpointing “concepts” that the computer can readily identify, such as big chunks of asphalt, blue sky, or clouds.

Watson then uses machine learning to detect a telltale temporal pattern—asphalt and trees giving way to sky and clouds over the space of a few seconds—and to conclude that the clip probably shows an airplane taking off. Something similar happens in healthcare: Watson fills in the empty cells by mining the records of all the patients and estimating what the value likely would have been for a given patient had it been measured, a statistical technique known as “imputation.”
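Here is a minimal sketch of what imputation can look like in code, using scikit-learn’s off-the-shelf k-nearest-neighbors imputer on invented numbers; Watson’s actual methods are surely more elaborate.

    import numpy as np
    from sklearn.impute import KNNImputer

    # Rows are patients; columns are age, systolic blood pressure,
    # and heart rate. np.nan marks cells that were never measured.
    records = np.array([
        [72.0, 150.0, np.nan],
        [68.0, np.nan,  80.0],
        [70.0, 145.0,  78.0],
        [25.0, 118.0, np.nan],
        [30.0, 121.0,  65.0],
    ])

    # Fill each empty cell using the two most similar patients,
    # judged on the columns they do have in common.
    completed = KNNImputer(n_neighbors=2).fit_transform(records)
    print(completed)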

* * *

Interestingly, when it comes to analyzing healthcare information, “There are two big data problems we’ve observed,” said Eric Brown, the Watson team’s lead engineer. First, there’s the obvious one: the literature of medicine, which currently contains about 24 million records and expands at a rate of 2,100 articles per day. The idea that a human being could keep up with this flood of literature is laughable.

The second problem is more surprising but equally daunting: the data contained within a patient’s own electronic health record. For a complicated patient, the EHR can easily contain thousands of pages, with both structured (such as laboratory test results) and unstructured (physician narratives) data. IBM and other companies[13] are developing techniques to mine these records, a task far more challenging and subtle than simply performing a Google search of them.

For example, if I’m interested in a patient’s risk for heart disease, I need to quickly determine his prior history of angina or heart attacks, his family history, the presence of key risk factors, and other clues to the disease that might be hiding in a doctor’s note, a laboratory study, or a cardiac catheterization report—any of which may have been performed last month or during the Clinton administration. Said IBM’s Brown, “It doesn’t take much data to create a big data problem for the human brain. For a physician to take in and understand a 10,000-page longitudinal patient record in the two to three minutes he has to prepare for the visit with that patient is a problem. And then we want to allow the physician to interact with that information during the visit in a way that doesn’t create a barrier between the doctor and the patient because the doctor’s head is buried in the EHR.”
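To see why a Google-style keyword search falls short, consider negation, one of the simplest of the subtleties involved. The toy notes and the crude regular-expression rule below are my own illustrations—real record-mining systems rely on full natural-language-processing pipelines.

    import re

    notes = [
        "Patient reports angina on exertion.",
        "No history of angina or myocardial infarction.",
        "Mother and father both healthy; no cardiac disease.",
    ]

    # A naive keyword search flags the first two notes equally, even
    # though the second one rules the diagnosis out.
    print([("angina" in note.lower()) for note in notes])

    # A crude negation rule: ignore mentions preceded in the same
    # sentence by "no," "denies," or "without."
    negated = re.compile(r"\b(no|denies|without)\b[^.]*\bangina\b", re.I)
    for note in notes:
        if "angina" in note.lower() and not negated.search(note):
            print("positive mention:", note)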

While much of the promise of big data involves improving care for individual patients, there is another, more ambitious, goal: the creation of a “learning healthcare system,” one that constantly mines all its data for patterns and insights to improve the organization of clinical practice. Over time, this kind of analysis is likely to help determine optimal staffing patterns, inform efforts to prevent hospital-acquired infections, and estimate prognosis and risk factors for bad events, ranging from heart attacks to readmissions.

Moreover, big-data techniques seem poised to become powerful research tools, not only to answer traditional questions, such as whether radiation therapy or chemotherapy works better for patients with a certain cancer, but also to assess new sources of data, from genetic analyses to Facebook “likes.” In the pre-big-data era, answering such questions required randomized clinical trials, in which patients were assigned by chance to one of two arms, treated with one alternative or the other, and followed over time.

But these new techniques create the opportunity to answer such questions by observing what happens to real patients who, for a variety of reasons, received different treatments. Using a Harry Potter analogy, Michael Lauer of the National Heart, Lung, and Blood Institute called this “Magic in the Muggle world.” He added, “It will allow us to do dozens, hundreds of large-scale studies at a very low cost.” These kinds of analyses have already demonstrated that what we once thought of as lung cancer is actually a series of different cancers, each with a particular genetic signature. These genes, rather than the patient’s age or even the appearance of the tumor under the pathologist’s microscope, may prove to be the key predictors of outcomes and the best guide to treatment.
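Such observational comparisons are trickier than they look, because patients who received one treatment may differ systematically from those who received the other. Below is a minimal sketch of one standard statistical remedy, inverse-propensity weighting, run on invented data; it is a generic stand-in, far simpler than the methods these studies actually use.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)

    # Invented data: older patients are likelier to get the treatment,
    # and age itself raises the risk of a bad outcome, so a naive
    # comparison makes the (truly protective) treatment look harmful.
    age = rng.normal(60, 10, 5000)
    treated = (rng.random(5000) < 1 / (1 + np.exp(-(age - 60) / 5))).astype(int)
    risk = np.clip(0.20 + 0.01 * (age - 60) - 0.05 * treated, 0, 1)
    outcome = (rng.random(5000) < risk).astype(int)

    print("naive:", outcome[treated == 1].mean(), "vs", outcome[treated == 0].mean())

    # Propensity score: each patient's probability of treatment given
    # age. Weighting by its inverse roughly mimics randomization.
    X = age.reshape(-1, 1)
    ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]
    weights = np.where(treated == 1, 1 / ps, 1 / (1 - ps))
    for group in (1, 0):
        mask = treated == group
        print("weighted:", group, np.average(outcome[mask], weights=weights[mask]))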

IBM’s Ebadollahi described how Watson makes sense of large datasets. One of its techniques, known as “patient similarity analytics,” is analogous to the method that allows Amazon and Netflix to say, “Customers like you also liked … .” In this case, though, it’s “Patients like this had less knee pain after taking infliximab than after taking methotrexate,” or “Patients like this lived a year longer than patients who lacked these attributes,” or “Patients like this became septic and hypotensive within 24 hours (even though they didn’t look so bad to their nurse or doctor),” or even “Patients like this did better when they saw Dr. Robinson instead of Dr. Reynolds.”

Ebadollahi sees such an approach, sometimes called “practice-based evidence,” as complementing, not supplanting, the more traditional “evidence-based practice,” with its focus on using published clinical research to guide prognosis and treatments. To illustrate how it works, he described a theoretical database containing 100,000 pieces of data on every patient, including clinical, demographic, financial, genetic, and perhaps even social media–derived data. “Out of those 100,000,” he said, “I can determine the 100 things of greatest importance to the question I’m trying to answer. Once I have that, I can see what happened to people who were similar to my patient on those 100 variables, and what course of action I should take.”
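Here is a hedged sketch of those two steps—winnowing thousands of variables down to the most relevant hundred, then finding the patients most similar on those variables—using generic scikit-learn stand-ins rather than Watson’s actual machinery.

    import numpy as np
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(0)

    # Invented cohort: 500 patients, 1,000 variables apiece, plus a
    # yes/no outcome (say, readmission within 30 days).
    X = rng.normal(size=(500, 1000))
    y = rng.integers(0, 2, size=500)

    # Step 1: keep the 100 variables most associated with the outcome.
    selector = SelectKBest(f_classif, k=100).fit(X, y)

    # Step 2: for a new patient, find the 10 most similar patients on
    # those 100 variables, then look at what actually happened to them.
    neighbors = NearestNeighbors(n_neighbors=10).fit(selector.transform(X))
    new_patient = selector.transform(rng.normal(size=(1, 1000)))
    _, idx = neighbors.kneighbors(new_patient)
    print("outcomes of the 10 most similar patients:", y[idx[0]])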

Even with a tool as potent as Watson, this is a harder problem than it might seem. First, there’s the missing data problem, as well as daunting signal-to-noise issues. There are also vagaries in the behavior of certain variables—even those as seemingly straightforward as age. For example, the prognostic importance of a 10-year bump in age is far from uniform: it’s trivial going from age 25 to 35, but sizable going from 70 to 80. Computer scientists (at IBM and elsewhere) are trying to create models that account for nuances like these.
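One generic way modelers handle this kind of nonlinearity is to expand a raw variable like age into a flexible set of basis features—splines, for example—so that a 10-year step can carry more weight at 70 than at 25. A minimal sketch (the settings are arbitrary, and this is a textbook technique, not necessarily IBM’s):

    import numpy as np
    from sklearn.preprocessing import SplineTransformer

    ages = np.array([[25.0], [35.0], [70.0], [80.0]])

    # Expand age into spline basis features so that a downstream
    # linear model can fit a curve: the 25-to-35 step no longer has
    # to matter exactly as much as the 70-to-80 step.
    features = SplineTransformer(degree=3, n_knots=5).fit_transform(ages)
    print(features.shape)  # one row per patient, one column per basis function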

Moreover, computers are likely to spit out strict goalposts, but patients and doctors may have preferences that challenge these hard guidelines, giving rise to the commonly voiced objections about “cookbook medicine.” Larry Fagan, the retired Stanford informatics expert, was being treated for cancer when I interviewed him in 2014. Coincidentally, years earlier he had helped develop a computerized algorithm known as ONCOCIN, which guides treatments at Stanford and other cancer centers. Fagan’s cancer was being treated with chemotherapy, but then his platelet count came back at 97,000 (platelets are the blood cells that promote clotting; the lower limit of normal is about 150,000). ONCOCIN recommended a significant reduction in his chemotherapy dose, which Fagan knew might compromise the effectiveness of the treatment. He also knew that platelet counts can bounce around—it’s not unheard of for one to vary by 20,000 to 30,000 from one test to the next.

“My nurse-practitioner checked the algorithm and said, ‘We have to do a dose reduction.’ I told her that was ridiculous, that the uncertainty associated with the platelet count meant that she should check it again.” She did, and the repeat result was 130,000. Fagan received his full dose of chemo.

After a lifetime of building and studying medical IT systems, that was the moment when Fagan fully appreciated the tension between computer rules and their clinical context. “There was this disconnect between the output of the algorithm and a human who wants to receive the full dose of chemo if it’s appropriate. I didn’t want the answer the program gave me. It was only when I became a patient that I fully appreciated the fuzziness of medicine.”

* * *

Big data is intoxicating stuff, but it’s worth pointing out that, despite the hype, this work is in its infancy, at least in healthcare. There is no question but that we will soon be awash in data, and that our ability to sift through it is improving rapidly. But will it be transformative? I remain skeptical, at least in the short term.

While our EHRs will contain vast amounts of information, big-data techniques may be no better at sifting through bloated copy-and-paste-ridden notes than are the frustrated doctors trying to do so today. Much of the data in EHRs continues to be collected for the purpose of creating a superior bill, and using this waste product of administrative functions for clinical decision making can lead to a GIGO (garbage in, garbage out) problem, even with fabulous analytics.

Moreover, while all of our genetic data may eventually help guide prognostic and treatment decisions, today there are only a handful of gene mutations, mostly in oncology, that have proved to be unambiguously valuable in real-life practice. Adding to the challenge, the privacy issues raised by widespread mining of individuals’ data have not been resolved. Even if the data are “de-identified” (meaning that the patient’s name and other readily identifiable information are stripped from the dataset), studies have shown that a patient’s identity can easily be reconstructed through analysis of a small number of variables, including blood test results—a modern fingerprint in 1’s and 0’s.
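The fingerprint problem is easy to demonstrate: count how many patients share each combination of a few seemingly innocuous fields; any combination that occurs once points to exactly one person. A toy sketch on invented data—published re-identification studies apply the same logic at scale:

    import pandas as pd

    # A "de-identified" extract: names are gone, but a few
    # quasi-identifiers remain.
    df = pd.DataFrame({
        "zip3":       ["941", "941", "100", "100", "941"],
        "birth_year": [1950, 1950, 1988, 1971, 1950],
        "sex":        ["F", "F", "M", "F", "F"],
    })

    # Count the records sharing each combination of fields.
    counts = df.groupby(["zip3", "birth_year", "sex"]).size()
    print(counts)
    print("uniquely identifiable:", (counts == 1).sum(), "of", len(df))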

As for big data drawn from sensors, as you might have guessed, Vinod Khosla is bullish. He told me about a prototype watch that he’s been testing. The watch bristles with tiny sensors. “It collected 19 million data points on me over a couple of days,” he pointed out. Khosla conceded that today, a good physician could judge his health more accurately—by talking to and examining him and running some standard tests—than a computer could by analyzing these bits of data. But, he continued, “Once we have readings on 100 million people, it will become more valuable. It’s not the data. It’s the complex math that creates insights about that data.”

And yet … and yet, as I reflect on the complexity of the problem, my instincts tell me that Khosla might not quite get it. In The Checklist Manifesto, the author and surgeon Atul Gawande recounted a study that vividly illustrates this complexity. In a single year, the trauma centers in the state of Pennsylvania saw 41,000 patients who had 1,224 different injury-related diagnoses—and, taken together, 32,261 unique combinations of injuries. Gawande described the findings to me in more detail:

Someone stabbed in the eye, and stabbed in the belly. Another person had a seat belt injury with a cardiac contusion and a long-bone fracture. And by the way, he’s on Coumadin, so there’s an anticoagulation problem.

The automated algorithm can tell you, “Well, here’s the best combination to try to treat.” But it’s 2 a.m. and my radiologist, who can do the angiography, is 40 minutes away. Because he just moved, and his house is now in a different place. Sorry.

This is life as a trauma surgeon. It’s what it’s like when I’ve been on call. The computerized database is not going to be kept up to speed fast enough. It’s not going to take all the factors of human life into account. This is the nature of fallibility, and we’re going to be at that juncture all the time. A human being is still going to have to be there, putting it together. Are they fallible? Absolutely. But the data are more fallible.

* * *

Perhaps someday, a company—maybe IBM, or Apple, or Google, or one that hasn’t yet been born—will be able to predict my clinical outcomes or guide my treatments by analyzing my pulse rate, my steps, my thyroid level, my diet, my genes, my underwear sweat, my Visa bill, my Twitter feed, and my golf handicap. But that day is not today, and it’s unlikely to be tomorrow.

For now, I see big data as a crucial area for research and development, one that is likely to bear fruit over time—particularly as EHRs become ubiquitous and somehow linked to patient-generated data from sensors and elsewhere, and as developers figure out how to integrate these tools with the habits and workflows of real people. A wise person once observed that we usually overestimate what can be done in a year and underestimate what can be done in a decade. To me, big data in healthcare fits that description perfectly.

If I am right, then, for the foreseeable future, the Quantified Self movement is likely to make its biggest mark among folks who have a bit too much time (and money) on their hands. Mark Smith recalled his experience as a member of the Qualcomm advisory board. Since the company specializes in providing—and profiting from—24/7 connectivity, there were many lively discussions about the value to consumers of minute-to-minute monitoring of things like blood sugar and heart rate. “There was one meeting where somebody said, ‘It’s like your credit score—you can hit a button and get your number!’ I asked him, ‘And how often do you check your credit score? Maybe once every six months, not five times a day.’”

“I would argue that if you’re checking your blood pressure every hour, you’re a self-monitoring narcissist,” Smith said. “Not an average human patient.”

* * *

At the end of my visit to IBM’s research headquarters in New York’s lovely Hudson Valley (the place where the machine that beat the Jeopardy champions was built), a few members of the Watson team told me about those thrilling days when it became clear that their invention had succeeded in mastering a task that few had thought possible. Winning the game was a joyous occasion for the IBMers, who celebrated with a small victory party.

I asked Eric Brown, who worked on the Jeopardy project and is now helping to lead Watson’s efforts in medicine, what the equivalent event might be in healthcare, the moment when his team could finally congratulate itself on its successes. I wondered if it would be the creation of some kind of holographic physician—like “The Doctor” on Star Trek: Voyager—with Watson serving as the cognitive engine. His answer, though, reflected the deep respect he and his colleagues have for the magnitude of the challenge. “It will be when we have a technology that physicians suddenly can’t live without,” he said.

And that was it. Just a useful tool. Nothing more and nothing less.


[13] I am an advisor to a company named QPID Health, started by a Harvard radiologist and informatician, that specializes in the task of mining the EHR for key data.