The exponential surge in health care data, such as longitudinal data from electronic health records (EHR) and sensor data from intensive care units (ICU), is providing new opportunities to discover meaningful data-driven characteristics and patterns of diseases. Recently, deep learning models have been employed for many computational phenotyping and healthcare prediction tasks to achieve state-of-the-art performance. However, deep models lack interpretability, which is crucial for wide adoption in medical research and clinical decision-making. In this paper, we introduce a simple yet powerful knowledge-distillation approach called interpretable mimic learning, which uses gradient boosting trees to learn interpretable models while achieving prediction performance as strong as that of deep learning models. Experimental results on a pediatric ICU dataset for acute lung injury (ALI) show that our proposed method not only outperforms state-of-the-art approaches on mortality and ventilator-free days prediction tasks but can also provide interpretable models to clinicians.
Interpretable Deep Models for ICU Outcome Prediction. Zhengping Che, Sanjay Purushotham, Robinder Khemani, and Yan Liu. Proceedings of the American Medical Informatics Association Annual Symposium (AMIA), 2016. Distilling Knowledge from Deep Networks with Applications to Computational Phenotyping. Zhengping Che, Sanjay Purushotham, and Yan Liu. Workshop on Data Science, Learning and Applications to Biomedical and Health Sciences (DSLA-BHS), 2016. Distilling Knowledge from Deep Networks with Applications to Healthcare Domain. Zhengping Che, Sanjay Purushotham, and Yan Liu. NIPS Workshop on Machine Learning for Healthcare (NIPS-MLHC), 2015.
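The distillation idea in the abstract above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: `teacher_soft_score` is a hypothetical stand-in for a trained deep model, the student is a hand-rolled gradient boosting ensemble of depth-1 regression trees (stumps) fit to the teacher's soft predictions with squared loss, and the data, tree count, and learning rate are arbitrary assumptions.

```python
import math
import random


def teacher_soft_score(x):
    """Hypothetical stand-in for a trained deep model: soft score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(2.0 * x[0] - 1.5 * x[1])))


def fit_stump(X, residuals):
    """Return the one-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, thr, lm, rm)
    return best[1:]  # (feature index, threshold, left value, right value)


def predict_stump(stump, x):
    j, thr, lm, rm = stump
    return lm if x[j] <= thr else rm


def fit_mimic(X, soft_targets, n_trees=30, lr=0.3):
    """Boost stumps on the residuals between teacher scores and current output."""
    base = sum(soft_targets) / len(soft_targets)
    preds = [base] * len(X)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(soft_targets, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + lr * predict_stump(stump, x) for p, x in zip(preds, X)]
    return base, lr, stumps


def predict_mimic(model, x):
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, x) for s in stumps)


random.seed(0)
X = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(100)]
soft = [teacher_soft_score(x) for x in X]  # soft labels from the teacher
model = fit_mimic(X, soft)

mean_soft = sum(soft) / len(soft)
baseline_mse = sum((mean_soft - y) ** 2 for y in soft) / len(soft)
mimic_mse = sum((predict_mimic(model, x) - y) ** 2 for x, y in zip(X, soft)) / len(X)

# A crude interpretability summary: how often each feature is split on.
split_counts = [sum(1 for s in model[2] if s[0] == j) for j in range(2)]
```

Because the student is a sum of single-feature threshold rules, each rule can be read directly, and `split_counts` gives a rough picture of which inputs the mimic model relies on. A practical version would more likely use an established gradient boosting library and held-out evaluation.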
Deep Computational Phenotyping
The rapid growth of digital health databases has attracted many researchers interested in using modern computational methods to discover and model patterns of health and illness in a research program known as computational phenotyping. We apply deep learning to the problem of discovery and detection of characteristic patterns of physiology in clinical time series data. We propose two novel modifications to standard neural net training that address challenges and exploit properties that are peculiar, if not exclusive, to medical data. First, we examine a general framework for using prior knowledge to regularize parameters in the topmost layers. This framework can leverage priors of any form, ranging from formal ontologies (e.g., ICD9 codes) to data-derived similarity. Second, we describe a scalable procedure for training a collection of neural networks of different sizes but with partially shared architectures. Both of these innovations are well-suited to medical applications, where available data are not yet Internet scale and have many sparse outputs (e.g., rare diagnoses) but which have exploitable structure (e.g., temporal order and relationships between labels). However, both techniques are sufficiently general to be applied to other problems and domains. We demonstrate the empirical efficacy of both techniques on two real-world hospital data sets and show that the resulting neural nets learn interpretable and clinically relevant features.
Deep Computational Phenotyping. Zhengping Che*, David Kale*, Wenzhe Li, Mohammad Taha Bahadori, and Yan Liu. Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), 2015. (*contributed equally) Causal Phenotype Discovery via Deep Networks. David C. Kale, Zhengping Che, Mohammad Taha Bahadori, Wenzhe Li, and Yan Liu. Proceedings of the American Medical Informatics Association Annual Symposium (AMIA), 2015.
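One way to picture the prior-based regularization mentioned in the abstract above is a similarity-weighted penalty on the top-layer weights: labels that the prior marks as related are encouraged to have similar weight vectors. The sketch below assumes this particular pairwise form, which the abstract itself does not specify; `similarity_penalty`, the weight layout, and the example similarity matrices are all illustrative assumptions.

```python
def similarity_penalty(weights, similarity):
    """Return 0.5 * sum_ij similarity[i][j] * ||w_i - w_j||^2.

    weights[i] is the (assumed) top-layer weight vector for output label i;
    similarity[i][j] >= 0 encodes prior relatedness of labels i and j,
    e.g. derived from an ontology or from label co-occurrence.
    """
    total = 0.0
    n = len(weights)
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(weights[i], weights[j]))
            total += similarity[i][j] * sq_dist
    return 0.5 * total


# Three labels; labels 0 and 1 share identical weights, label 2 differs.
weights = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Prior A says labels 0 and 1 are related: no penalty, they already agree.
prior_a = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
# Prior B says labels 0 and 2 are related: their disagreement is penalized.
prior_b = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]

penalty_a = similarity_penalty(weights, prior_a)
penalty_b = similarity_penalty(weights, prior_b)
```

In training, a term like this would be added to the prediction loss with a tunable coefficient, so that rare labels can borrow strength from related, better-represented ones.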
Deep Models for EHRs with Missing Values
Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a phenomenon known as informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improved prediction performance. We develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on the Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments on time series classification tasks with real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
Recurrent Neural Networks for Multivariate Time Series with Missing Values. Zhengping Che, Sanjay Purushotham, Kyunghyun Cho, David Sontag, and Yan Liu. arXiv preprint arXiv:1606.01865.
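The two missing-pattern representations named in the abstract above, masking and time interval, can be computed from a raw series as sketched below. The exact definitions are assumptions chosen for illustration: the mask is 1 where a variable is observed and 0 otherwise, and the time interval is the time elapsed since that variable was last observed (0 at the first step). The function name and the data layout (`None` marks a missing value) are hypothetical.

```python
def masks_and_intervals(timestamps, values):
    """Compute per-variable masks and time-since-last-observation intervals.

    timestamps[t] is the time of step t; values[t][d] is the value of
    variable d at step t, or None if it was not measured.
    """
    n_steps = len(values)
    n_vars = len(values[0]) if n_steps else 0
    masks = [[0 if v is None else 1 for v in row] for row in values]
    deltas = [[0.0] * n_vars for _ in range(n_steps)]
    for t in range(1, n_steps):
        gap = timestamps[t] - timestamps[t - 1]
        for d in range(n_vars):
            if masks[t - 1][d] == 1:
                # Observed at the previous step: the interval restarts.
                deltas[t][d] = gap
            else:
                # Still missing at the previous step: the interval accumulates.
                deltas[t][d] = gap + deltas[t - 1][d]
    return masks, deltas


# Two variables sampled irregularly at times 0, 1, 3, and 6.
timestamps = [0, 1, 3, 6]
values = [
    [5.0, None],
    [None, 2.0],
    [None, None],
    [7.0, 1.0],
]
masks, deltas = masks_and_intervals(timestamps, values)
```

A recurrent model can then take the masks and intervals as extra inputs alongside the (imputed) values, so that how long a measurement has been absent becomes a usable signal.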
Variational Adversarial Deep Domain Adaptation
We study the problem of learning domain-invariant representations for time series data while transferring the complex temporal latent dependencies between domains. Our model, termed Variational Recurrent Adversarial Deep Domain Adaptation (VRADA), is built atop a variational recurrent neural network (VRNN) and is trained adversarially to capture complex temporal relationships that are domain-invariant. This is, as far as we know, the first model to capture and transfer temporal latent dependencies of multivariate time-series data. Through experiments on real-world multivariate healthcare time-series datasets, we empirically demonstrate that learning temporal dependencies improves our model’s ability to create domain-invariant representations, allowing our model to outperform current state-of-the-art deep domain adaptation approaches.
Variational Adversarial Deep Domain Adaptation for Healthcare Time Series Analysis. Sanjay Purushotham, Wilka Carvalho and Yan Liu. NIPS Machine Learning for Healthcare Workshop (NIPS-ML4HC), 2016. Variational Adversarial Deep Domain Adaptation for Healthcare Time Series Analysis. Sanjay Purushotham, Wilka Carvalho and Yan Liu. SoCal Machine Learning Symposium (SCMLS), 2016.
Due to the sheer amount of data being generated, most medical time series lack annotations or gold standard labels. Whatever annotations are available may also be subjective and prone to human error. We propose two new techniques for the analysis of medical time series data in the face of these challenges. Our motivating application, pediatric ventilator management, has all the characteristics listed above. The first technique is an anomaly detection technique designed for a real-time patient monitoring application. The components used in this technique are robust and not computationally intensive. We also demonstrate how to discover bias present in the dataset. The second technique consists of a descriptor for waveform time series and its corresponding dissimilarity measure. They can be used with most existing clustering algorithms for the purpose of anomaly discovery and categorization.
Real-time Detection and Exploratory Discovery of Anomalies for Pediatric Ventilator Management. Tanachat Nilanon, Yan Liu, Justin Hotz, and Robinder Khemani. Proceedings of Machine Learning in Health Care (MLHC), 2016. Normal / Abnormal Heart Sound Recordings Classification Using Deep Recurrent Neural Network. Tanachat Nilanon, Sanjay Purushotham, and Yan Liu. Proceedings of the Computing in Cardiology (CinC), 2016.