An obvious trend in modern societies is the proliferation of systems that can sense and react to the world: smart phones, smart homes, self-driving cars, and smart cities. This proliferation of smart devices and sensors presents challenges to our privacy, but it is also driving the growth of big data and the development of new technology paradigms, such as the Internet of Things. In this context, data science will have a growing impact across many areas of our lives. There are two areas, however, where data science will lead to particularly significant developments in the coming decade: personalized medicine and the development of smart cities.
In recent years, the medical industry has been exploring and adopting data science and predictive analytics. Doctors have traditionally had to rely on their experience and instincts when diagnosing a condition or deciding on the next treatment. The evidence-based-medicine and precision-medicine movements argue that medical decisions should instead be based on data, ideally linking the best available data to an individual patient’s predicament and preferences. For example, in the case of precision medicine, fast genome-sequencing technology means that it is now feasible to analyze the genomes of patients with rare diseases in order to identify the mutations that cause the disease and to design and select therapies specific to that individual. Another factor driving data science in medicine is the cost of health care. Data science, in particular predictive analytics, can be used to automate some health care processes. For example, predictive analytics has been used to decide when antibiotics and other medicines should be administered to babies and adults, and it is widely reported that many lives have been saved because of this approach.
Medical sensors that are worn, ingested, or implanted are being developed to continuously monitor a patient’s vital signs and behaviors and how his or her organs are functioning throughout the day. These data are continuously gathered and fed back to a centralized monitoring server, where health care professionals can access the data generated by all their patients, assess each patient’s condition, understand what effects a treatment is having, and compare a patient’s results with those of other patients with similar conditions to inform decisions about the next steps in the treatment regime. Medical science is using the data generated by these sensors, integrated with additional data from the various parts of the medical profession and the pharmaceutical industry, to determine the effects of current and new medicines. Personalized treatment programs are being developed based on the type of patient, his or her condition, and how his or her body responds to various medicines. In addition, this new type of medical data science is feeding into new research on medicines and their interactions, the design of more efficient and detailed monitoring systems, and the uncovering of greater insights from clinical trials.
Various cities around the world are adopting new technology to gather and use the data generated by their citizens in order to better manage the cities’ organizations, utilities, and services. There are three core enablers of this trend: data science, big data, and the Internet of Things. The name “Internet of Things” describes the internetworking of physical devices and sensors so that these devices can share information. This may sound mundane, but it means that we can now remotely control smart devices (such as a properly configured smart home), and it opens up the possibility that networked machine-to-machine communication will enable smart environments to autonomously predict and react to our needs (for example, there are now commercially available smart refrigerators that can warn you when food is about to spoil and allow you to order fresh milk through your smart phone).
Smart-city projects integrate real-time data from many different sources into a single data hub, where the data are analyzed and used to inform management and planning decisions. Some smart-city projects involve building brand-new cities that are smart from the ground up. Both Masdar City in the United Arab Emirates and Songdo City in South Korea are brand-new cities built with smart technology at their core and with a focus on being eco-friendly and energy efficient. However, most smart-city projects involve retrofitting existing cities with new sensor networks and data-processing centers. For example, in the SmartSantander project in Spain,1 more than 12,000 networked sensors have been installed across the city to measure temperature, noise, ambient lighting, carbon monoxide levels, and parking. Smart-city projects often focus on improving energy efficiency, planning and routing traffic, and matching utility services to population needs and growth.
Japan has embraced the smart-city concept with a particular focus on reducing energy usage. The Tokyo Electric Power Company (TEPCO) has installed more than 10 million smart meters across homes in its service area.2 At the same time, TEPCO is developing and rolling out smart-phone applications that enable customers to track the electricity used in their homes in real time and to change their electricity contract. These applications also enable TEPCO to send each customer personalized energy-saving advice. Outside the home, smart-city technology can reduce energy usage through intelligent street lighting. The Glasgow Future Cities Demonstrator is piloting street lighting that switches on and off depending on whether people are present. Energy efficiency is also a top priority for new buildings, particularly large local-government and commercial buildings, whose energy efficiency can be optimized by automatically managing climate controls through a combination of sensor technology, big data, and data science. An extra benefit of these smart-building monitoring systems is that they can also monitor pollution levels and air quality and activate the necessary controls and warnings in real time.
Transport is another area where cities are using data science. Many cities have implemented traffic-monitoring and management systems that use real-time data to control the flow of traffic through the city. For example, they can control traffic-light sequences in real time, in some cases to give priority to public-transport vehicles. Data on city transport networks are also useful for planning public transport. Cities are examining routes, schedules, and vehicle management to ensure that services support the maximum number of people and to reduce the costs of delivering those services. In addition to modeling the public-transport network, data science is being used to monitor official city vehicles to ensure their optimal usage. Such projects combine data on traffic conditions (collected by sensors along the road network, at traffic lights, and so on), the type of task being performed, and other factors to optimize route planning, with dynamic adjustments fed to the vehicles as live updates to their routes.
Beyond energy usage and transport, data science is being used to improve the provision of utility services and to support longer-term planning of infrastructure projects. The provision of utility services is constantly monitored based on current and projected usage, taking into account previous usage under similar conditions. Utility companies are using data science in a number of ways. One is monitoring the delivery network for the utility: the supply, the quality of the supply, any network issues, areas with higher-than-expected usage, automated rerouting of the supply, and any anomalies in the network. Another is monitoring their customers: looking for unusual usage that might indicate criminality (for example, a grow house), customers who may have tampered with the equipment or meters in their buildings, and customers who are most likely to default on their payments. Data science is also being used in city planning to examine the best way to allocate housing and associated services. Models of population growth are built to forecast future demand, and based on various simulations city planners can estimate when and where support services, such as high schools, will be needed.
Data science projects sometimes fail to deliver what was hoped for: they get bogged down in technical or political issues, they do not produce useful results, or, more typically, they are run once (or a couple of times) but never again. Just like Leo Tolstoy’s happy families,3 the success of a data science project depends on a number of factors. Successful data science projects need focus, good-quality data, the right people, the willingness to experiment with multiple models, integration into the business information technology (IT) architecture and processes, buy-in from senior management, and an organization’s recognition that because the world changes, models go out of date and need to be rebuilt semiregularly. Failure in any of these areas is likely to result in a failed project. This section details the common factors that determine the success of data science projects as well as the typical reasons why they fail.
Every successful data science project begins by clearly defining the problem that the project will help solve. In many ways, this step is just common sense: it is difficult for a project to be successful unless it has a clear goal. Having a well-defined goal informs the decisions regarding which data to use, what ML algorithms to use, how to evaluate the results, how the analysis and models will be used and deployed, and when the optimal time might be to go through the process again to update the analysis and models.
A well-defined question can be used to define what data are needed for the project. Having a clear understanding of what data are needed helps to direct the project to where these required data are located. It also helps identify what data are currently unavailable and hence which additional projects might be needed to capture and make these data available. It is important, however, to ensure that the data used are of good quality. Organizations may have poorly designed applications, a poor data model, or staff who are not trained to ensure that good data get entered; indeed, myriad factors can lead to bad-quality data in systems. The need for good-quality data is so important that some organizations hire people specifically to inspect the data, assess their quality, and feed back ideas on how to improve the data captured by the applications and by the people inputting them. Without good-quality data, it is very difficult for a data science project to succeed.
When sourcing the required data, it is important to check what data are already being captured and used across the organization. Unfortunately, the approach taken by some data science projects is to look at what data are available in the transactional databases (and other data sources) and then to integrate and clean these data before going on to data exploration and analysis. This approach completely ignores the BI team and any data warehouse that might exist. In many organizations, the BI and data-warehouse team is already gathering, cleaning, transforming, and integrating the organization’s data into one central repository. If a data warehouse already exists, then it probably contains all or most of the data required by a project, so using it can save a significant amount of time on integrating and cleaning the data. The warehouse will also contain much more historical data than the current transactional databases. If the data warehouse is used, it is possible to go back a number of years, build predictive models using the historic data, roll these models through various time periods, and then measure each model’s predictive accuracy. This process allows for the monitoring of changes in the data and how they affect the models. In addition, it is possible to monitor variations in the models produced by ML algorithms and how the models evolve over time. Following this kind of approach makes it possible to demonstrate how the models work and behave over a number of years and helps build the customer’s confidence in what is being done and what can be achieved. For example, in one project where five years of historical data were available in the data warehouse, it was possible to demonstrate that the company could have saved US$40 million or more over that time period. If the data warehouse had not been available or used, it would not have been possible to demonstrate this conclusion. Finally, when a project uses personal data, it is essential to ensure that the use of these data is in line with the relevant antidiscrimination and privacy regulations.
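To make this rolling evaluation concrete, here is a minimal sketch, assuming the warehouse extract can be loaded into a pandas DataFrame with a year column, a set of feature columns, and a binary target; the column names, the choice of logistic regression, and the accuracy metric are all illustrative rather than prescriptive.

```python
# Illustrative sketch: train a model on each year of warehouse data and
# evaluate it on the following year, to show how predictive accuracy
# holds up over time. All names (df, feature_cols, "target", "year")
# are hypothetical placeholders for the real warehouse extract.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def rolling_backtest(df, feature_cols, target_col="target", year_col="year"):
    """Return a dict mapping each test year to the model's accuracy on it."""
    results = {}
    years = sorted(df[year_col].unique())
    for train_year, test_year in zip(years[:-1], years[1:]):
        train = df[df[year_col] == train_year]
        test = df[df[year_col] == test_year]
        model = LogisticRegression(max_iter=1000)
        model.fit(train[feature_cols], train[target_col])
        predictions = model.predict(test[feature_cols])
        results[test_year] = accuracy_score(test[target_col], predictions)
    return results
```

Charting the per-year scores produced by this kind of loop is one way to show stakeholders how a model trained in one period performs in later periods and whether its accuracy degrades as conditions change.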
A successful data science project often involves a team of people with a blend of data science competencies and skills. In most organizations, a variety of people in existing roles can and should contribute to data science projects: people working with databases, people who work with the ETL process, people who perform data integration, project managers, business analysts, domain experts, and so on. But organizations often still need to hire data science specialists—that is, people with the skills to work with big data, to apply ML, and to frame real-world problems in terms of data-driven solutions. Successful data scientists are willing and able to work and communicate with the management team, end users, and everyone else involved to show and explain how data science can support their work. It is difficult to find people who have both the required technical skill set and the ability to communicate and work with people across an organization. However, this blend is crucial to the success of data science projects in most organizations.
It is important to experiment with a variety of ML algorithms to discover which works best with a given data set. All too often in the literature, examples are given of cases where only one ML algorithm was used, perhaps because the authors are discussing the algorithm that worked best for them or simply their favorite. Currently there is a great deal of interest in the use of neural networks and deep learning, but many other algorithms can be used, and these alternatives should be considered and tested. Furthermore, for data science projects based in the EU, the General Data Protection Regulation (GDPR), which goes into effect in May 2018, may become a factor in the selection of algorithms and models. A potential side effect of this regulation is that an individual’s “right to explanation” in relation to automated decision processes that affect them may limit the use, in some domains, of complex models that are difficult to interpret and explain (such as deep neural network models).
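As a hedged illustration of this kind of experimentation, the sketch below uses scikit-learn cross-validation to compare a handful of candidate classifiers on a stand-in data set; the particular algorithms, hyperparameters, and data set are illustrative choices, not recommendations.

```python
# Illustrative sketch: compare several candidate algorithms with
# cross-validation rather than committing to a single favorite.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in for the project's data

candidates = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "decision tree": DecisionTreeClassifier(max_depth=5),
    "random forest": RandomForestClassifier(n_estimators=200),
    "neural network": MLPClassifier(max_iter=2000),
}

for name, model in candidates.items():
    # Scale the inputs and estimate accuracy with 5-fold cross-validation.
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```

A comparison like this also surfaces the interpretability trade-off noted above: the logistic regression and decision tree are far easier to explain to an affected individual than the neural network, even when their accuracy is similar.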
When the goal of a data science project is being defined, it is vital also to define how the outputs and results of the project will be deployed within the organization’s IT architecture and business processes. Doing so involves identifying where and how the model will be integrated within existing systems and whether the generated results will be used directly by end users or fed into another process. The more automated this process is, the quicker the organization can respond to its customers’ changing profiles, thereby reducing costs and increasing potential profits. For example, if a customer-risk model is built for the loan process in a bank, it should be built into the front-end system that captures the customer’s loan application. That way, when the bank employee is entering the loan application, she can be given live feedback by the model and can use this feedback to address any issues with the customer. Another example is fraud detection. Traditionally, it can take four to six weeks to identify a potential fraud case that needs investigation. By building data science into transaction-monitoring systems, organizations can now detect potential fraud cases in near real time. By automating and integrating data-driven models, quicker response times are achieved, and actions can be taken at the right time. If the outputs and models created by a project are not integrated into the business processes, then these outputs will not be used, and, ultimately, the project will fail.
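To make the loan example concrete, here is a minimal sketch of how a trained risk model might sit behind the application-capture screen; the model file, feature names, and review threshold are hypothetical, and a production deployment would add input validation, logging, and audit trails.

```python
# Hypothetical sketch: a scoring function the loan-application front end
# could call to give the employee live feedback. The model file, feature
# names, and threshold are illustrative, not from any real system.
import joblib
import pandas as pd

MODEL = joblib.load("customer_risk_model.joblib")  # assumed pre-trained pipeline
REVIEW_THRESHOLD = 0.7                              # illustrative cut-off

def score_loan_application(application: dict) -> dict:
    """Return a risk score and a recommendation for a single application."""
    features = pd.DataFrame([application])          # one-row feature frame
    risk = float(MODEL.predict_proba(features)[0, 1])
    return {
        "risk_score": risk,
        "recommendation": "refer for review" if risk >= REVIEW_THRESHOLD else "proceed",
    }

# Example call from the front-end system (field names are hypothetical):
# score_loan_application({"income": 42000, "loan_amount": 15000, "tenure_years": 4})
```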
In most organizations, support from senior management is crucial to the success of a data science project. However, most senior IT managers are very focused on the here and now: keeping the lights on, making sure their day-to-day applications are up and running, making sure the backups and recovery processes are in place (and tested), and so on. Successful data science projects tend to be sponsored by senior business managers (rather than by an IT manager) because the former are focused not on the technology but on the processes involved in the data science project and how its outputs can be used to the organization’s advantage. The more focused a project sponsor is on these factors, the more successful the project is likely to be. The sponsor also plays a key role in informing the rest of the organization about the project and selling its benefits. But even when data science has a senior manager as an internal champion, a data science strategy can still fail in the long term if the initial data science project is treated as a box-ticking exercise. The organization should not view data science as a one-off project. For an organization to reap long-term benefits, it needs to build its capacity to execute data science projects regularly and to use the outputs of these projects. Viewing data science as a strategy takes long-term commitment from senior management.
Most data science projects will need to be updated and refreshed on a semiregular basis. In each new iteration, new data can be added, new algorithms can be tried, and so on. The frequency of these iterations will vary from project to project; it could be daily, quarterly, biannual, or annual. Checks should be built into the production deployment of the data science outputs to detect when models need updating (see Kelleher, Mac Namee, and D’Arcy 2015 for an explanation of how to use a stability index to identify when a model should be updated).
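One common formulation of such a check is a population-style stability index, which compares the distribution of model scores at training time with the distribution on recent data; the sketch below is illustrative, and the exact definition used by Kelleher, Mac Namee, and D’Arcy (2015) may differ in its details.

```python
# Illustrative sketch of a stability index: bin the scores the model produced
# at training time ("expected") and the scores on recent data ("actual"),
# then sum (actual% - expected%) * ln(actual% / expected%) across the bins.
import numpy as np

def stability_index(expected_scores, actual_scores, bins=10):
    """Larger values indicate a bigger shift between the two distributions."""
    edges = np.histogram_bin_edges(expected_scores, bins=bins)
    expected_counts, _ = np.histogram(expected_scores, bins=edges)
    actual_counts, _ = np.histogram(actual_scores, bins=edges)
    # Convert counts to proportions; clip to avoid log(0) and division by zero.
    expected_pct = np.clip(expected_counts / expected_counts.sum(), 1e-6, None)
    actual_pct = np.clip(actual_counts / actual_counts.sum(), 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

# Rule-of-thumb thresholds (illustrative): below 0.1 little change has occurred,
# 0.1 to 0.25 suggests monitoring, and above 0.25 suggests rebuilding the model.
```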
Humans have always abstracted from the world and tried to understand it by identifying patterns in their experiences of it. Data science is the latest incarnation of this pattern-seeking behavior. However, although data science has a long history, the breadth of its impact on modern life is without precedent. In modern societies, the words precision, smart, targeted, and personalized are often indicative of data science projects: precision medicine, precision policing, precision agriculture, smart cities, smart transport, targeted advertising, personalized entertainment. The common factor across all these areas of human life is that decisions have to be made: What treatment should we use for this patient? Where should we allocate our policing resources? How much fertilizer should we spread? How many high schools do we need to build in the next four years? Who should we send this advertisement to? What movie or book should we recommend to this person? The power of data science to help with decision making is driving its adoption. Done well, data science can provide actionable insight that leads to better decisions and ultimately better outcomes.
Data science, in its modern guise, is driven by big data, computer power, and human ingenuity from a number of fields of scientific endeavor (from data mining and database research to machine learning). This book has tried to provide an overview of the fundamental ideas and concepts required to understand data science. The CRISP-DM project life cycle makes the data science process explicit and provides a structure for the data science journey from data to wisdom: understand the problem, prepare the data, use ML to extract patterns and create models, use the models to get actionable insight. The book also touches on some of the ethical concerns relating to individual privacy in a data science world. People have genuine and well-founded concerns that data science has the potential to be used by governments and vested interests to manipulate our behaviors and police our actions. We, as individuals, need to develop informed opinions about what type of a data world we want to live in and to think about the laws we want our societies to develop in order to steer the use of data science in appropriate directions. Despite the ethical concerns we may have around data science, the genie is already very much out of the bottle: data science is having and will continue to have significant effects on our daily lives. When used appropriately, it has the potential to improve our lives. But if we want the organizations we work with, the communities we live in, and the families we share our lives with to benefit from data science, we need to understand and explore what data science is, how it works, and what it can (and can’t) do. We hope this book has given you the essential foundations you need to go on this journey.