Deep feed-forward networks

In Chapter 8, Healthcare Predictive Models – A Review, we discussed a study by Duke University that predicted hospital readmissions using a variety of algorithms (Futoma et al., 2015). In the second part of the study, the authors used a deep feed-forward network (similar to the one schematically depicted previously) with three hidden layers, each consisting of 200-750 neurons depending on the specific condition. They used the sigmoid function as the nonlinear activation for nearly all of the layers, with a softmax function at the output layer (a common practice in deep learning). They built a model for each of the five CMS-incentivized conditions (pneumonia, congestive heart failure, chronic obstructive pulmonary disease, myocardial infarction, and total hip/knee arthroplasty) and pretrained each network on all of the observations pertaining to that disease (recall that there were approximately 3,000,000 observations in total).
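
To make the architecture concrete, the following is a minimal sketch of such a feed-forward network written with the Keras API (tf.keras). This is not the authors' implementation; the input width, the hidden-layer size of 400 neurons, and the two-class softmax output are illustrative assumptions that simply follow the description above (three sigmoid hidden layers feeding a softmax output).

import tensorflow as tf
from tensorflow.keras import layers, models

n_features = 500   # assumed width of the input feature vector (illustrative)
n_hidden = 400     # within the reported 200-750 range; purely illustrative

# Three sigmoid hidden layers followed by a softmax output over two classes
# (readmitted vs. not readmitted)
model = models.Sequential([
    layers.Input(shape=(n_features,)),
    layers.Dense(n_hidden, activation="sigmoid"),
    layers.Dense(n_hidden, activation="sigmoid"),
    layers.Dense(n_hidden, activation="sigmoid"),
    layers.Dense(2, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()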

They used a number of other tricks to achieve maximal performance, including ridge (L2) weight penalization, dropout, and early stopping (see Goodfellow et al., 2016, for an explanation of these techniques); a sketch of these techniques follows below. They achieved better results than the best non-deep learning algorithm for all five conditions (although the improvement was statistically significant for only three of the five diseases). Their reported AUCs were also better than the previously reported AUCs for the LACE and HOSPITAL readmission scores. This study therefore demonstrates the promise of deep learning for solving complex prediction problems in healthcare, while also highlighting the difficulty and complexity of training such models.
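
The following sketch shows how those three regularization tricks might be added to such a network in Keras: an L2 (ridge) penalty on the hidden-layer weights, dropout after each hidden layer, and an early-stopping callback that monitors validation loss. Again, this is an illustration of the techniques rather than the study's code; all hyperparameter values (penalty strength, dropout rate, patience) and variable names are assumptions.

import tensorflow as tf
from tensorflow.keras import layers, models, regularizers, callbacks

def build_regularized_model(n_features, n_hidden=400,
                            l2_strength=1e-4, dropout_rate=0.5):
    model = models.Sequential([layers.Input(shape=(n_features,))])
    for _ in range(3):
        # Ridge (L2) penalty applied to each hidden layer's weights
        model.add(layers.Dense(n_hidden, activation="sigmoid",
                               kernel_regularizer=regularizers.l2(l2_strength)))
        # Dropout randomly silences a fraction of units during training
        model.add(layers.Dropout(dropout_rate))
    model.add(layers.Dense(2, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Early stopping halts training once the validation loss stops improving
early_stop = callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                     restore_best_weights=True)

# Hypothetical usage (X_train, y_train, X_val, y_val are not defined here):
# model = build_regularized_model(n_features=500)
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=100, batch_size=256, callbacks=[early_stop])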