Date of Award

Spring 1-1-2019

Document Type


Degree Name

Master of Science (MS)

First Advisor

Chenhao Tan

Second Advisor

Martha Palmer

Third Advisor

James Martin


The digitalization of health records has enabled the collection of large-scale, valuable datasets on healthcare. However, it has also led to complaints about the diminishing value of medical notes and often contributes to growing physician burnout. Because well-written notes can potentially improve the quality of healthcare, it is important that doctors get machine assistance with writing notes to help bridge this gap in quality. We therefore examine the value of medical notes compared to the structured information in electronic health records through a prediction framework. We hypothesize that 1) medical notes provide additional predictive power beyond structured information; and 2) certain parts of medical notes are more valuable than others (for example, original vs. "copy-pasted" text). To evaluate our hypotheses, we use the task of in-hospital mortality prediction, using time series derived from structured information for the first 24 hours and the first 48 hours of a patient's admission. We also run a retrospective mortality prediction task that uses all of the data associated with the patient's admission. Our results show that although medical notes add only marginal predictive value to structured information, using the two together consistently improves prediction. Surprisingly, we also find that the more common English words in notes provide more value than the uncommon English words (a group that also includes medical words). Our findings indicate that there is great room for further understanding and improving the value of medical notes.