Predicting hospital admission at emergency department triage using machine learning
RESEARCH ARTICLE
Woo Suk Hong1, Adrian Daniel Haimovich1, R. Andrew Taylor2*
1 Yale School of Medicine, New Haven, Connecticut, United States of America, 2 Department of Emergency
Medicine, Yale School of Medicine, New Haven, Connecticut, United States of America
Abstract
Objective
To predict hospital admission at the time of ED triage using patient history in addition to
information collected at triage.
OPEN ACCESS
Citation: Hong WS, Haimovich AD, Taylor RA
(2018) Predicting hospital admission at emergency
department triage using machine learning. PLoS
ONE 13(7): e0201016. https://doi.org/10.1371/journal.pone.0201016
Editor: Qunfeng Dong, University of North Texas,
UNITED STATES
Received: February 12, 2018
Accepted: July 6, 2018
Published: July 20, 2018
Copyright: © 2018 Hong et al. This is an open
access article distributed under the terms of the
Creative Commons Attribution License, which
permits unrestricted use, distribution, and
reproduction in any medium, provided the original
author and source are credited.
Data Availability Statement: The raw data used in
this study was derived from electronic health
records of patient visits to the Yale New Haven
Health system and is not publicly available due to
the ubiquitous presence of protected health
information (PHI). A de-identified, processed
dataset of all patient visits included in the models,
as well as scripts used for processing and analysis,
is available in the GitHub repository
(https://github.com/yaleemmlc/admissionprediction)
(10.5281/zenodo.1308993). All other data is available within
the paper and its Supporting Information files.
Methods
This retrospective study included all adult ED visits between March 2014 and July 2017
from one academic and two community emergency rooms that resulted in either admission
or discharge. A total of 972 variables were extracted per patient visit. Samples were randomly partitioned into training (80%), validation (10%), and test (10%) sets. We trained a
series of nine binary classifiers using logistic regression (LR), gradient boosting (XGBoost),
and deep neural networks (DNN) on three dataset types: one using only triage information,
one using only patient history, and one using the full set of variables. Next, we tested the
potential benefit of additional training samples by training models on increasing fractions of
our data. Lastly, variables of importance were identified using information gain as a metric to
create a low-dimensional model.
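The partitioning and the training-set-size experiment described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function names `split_indices` and `training_subsets`, the random seed, and the particular fractions are all assumptions introduced here.

```python
import random


def split_indices(n_samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Randomly partition indices 0..n_samples-1 into train/validation/test.

    The test set receives whatever remains after the train and validation
    sets are taken (10% under the default fractions).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility (assumed)
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


def training_subsets(train, fractions=(0.1, 0.25, 0.5, 1.0)):
    """Return nested prefixes of the training set, one per fraction.

    The fractions are hypothetical; a model would be refit on each subset
    and scored on the same held-out data to see where performance plateaus.
    """
    return {f: train[:int(len(train) * f)] for f in fractions}


train_idx, val_idx, test_idx = split_indices(1000)
subsets = training_subsets(train_idx)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the subsets are prefixes of one shuffled list, each larger subset contains the smaller ones, so any change in performance reflects added samples rather than a different draw.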
Results
A total of 560,486 patient visits were included in the study, with an overall admission risk of
29.7%. Models trained on triage information yielded a test AUC of 0.87 for LR (95% CI 0.86–
0.87), 0.87 for XGBoost (95% CI 0.87–0.88) and 0.87 for DNN (95% CI 0.87–0.88). Models
trained on patient history yielded an AUC of 0.86 for LR (95% CI 0.86–0.87), 0.87 for
XGBoost (95% CI 0.87–0.87) and 0.87 for DNN (95% CI 0.87–0.88). Models trained on the
full set of variables yielded an AUC of 0.91 for LR (95% CI 0.91–0.91), 0.92 for XGBoost
(95% CI 0.92–0.93) and 0.92 for DNN (95% CI 0.92–0.92). All algorithms reached maximum
performance at 50% of the training set or less. A low-dimensional XGBoost model built on
ESI level, outpatient medication counts, demographics, and hospital usage statistics yielded
an AUC of 0.91 (95% CI 0.91–0.91).
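To make the reported metric concrete, the sketch below computes an AUC and a 95% confidence interval on toy data. It is illustrative only: the abstract does not say how the intervals were obtained, so the percentile bootstrap used here is an assumption, and the names `auc` and `bootstrap_ci` and their parameters are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher than
    a random negative case, with ties counted as one half (rank-sum form)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC requires at least one positive and one negative")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 1 = admitted, 0 = discharged; scores are predicted risks.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores, n_boot=200)
print(toy_auc, toy_ci)
```

On a test set of tens of thousands of visits, resampling narrows the interval considerably, which is consistent with the tight intervals reported above.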
PLOS ONE | https://doi.org/10.1371/journal.pone.0201016 July 20, 2018
Funding: WH is supported by the James G. Hirsch
Endowed Medical Student Research Fellowship at
Yale University School of Medicine. AH is
supported by National Institutes of Health grants
1F30CA196191 and T32GM007205. RT received
no specific funding for this study. The funders had
no role in study design, data collection and
analysis, decision to publish, or preparation of the
manuscript.
Conclusion
Machine learning can robustly predict hospital admission using triage information and
patient history. The addition of historical information improves predictive performance significantly compared to using triage information alone, highlighting the need to incorporate
these variables into prediction models.
Competing interests: The authors have declared
that no competing interests exist.
Introduction
While most emergency department (ED) visits end in discharge, EDs represent the largest
source of hospital admissions [1]. On arrival at the ED, patients are first sorted by acuity in
order to prioritize individuals requiring urgent medical intervention. This sorting process,
called "triage", is typically performed by a member of the nursing staff based on the patient’s
demographics, chief complaint, and vital signs. Subsequently, the patient is seen by a medical
provider who creates the initial care plan and ultimately recommends a disposition, which this
study limits to hospital admission or discharge.
Prediction models in medicine seek to improve patient care and increase logistical efficiency [2,3]. For example, prediction models for sepsis or acute coronary syndrome are
designed to alert providers of potentially life-threatening conditions, while models for hospital
utilization or patient-flow enable resource optimization on a systems level [4–8]. Early identification of ED patients who are likely to require admission may enable better optimization of
hospital resources through improved understanding of ED patient mixtures [9]. It is increasingly understood that ED crowding is correlated with poorer patient outcomes [10]. Notification of administrators and inpatient teams regarding potential admissions may help alleviate
this problem [11]. From the perspective of patient care in the ED setting, a patient’s likelihood
of admission may serve as a proxy for acuity, which is used in a number of downstream decisions such as bed placement and the need for emergency intervention [12–14].
Numerous prior studies have sought to predict hospital admission at the time of ED triage.
Most models only include information collected at triage such as demographics, vital signs,
chief complaint, nursing notes, and early diagnostics [11,14–19], while some models include
additional features such as hospital usage statistics and past medical history [9,12,20,21]. A few
models built on triage information have been formalized into clinical decision rules such as
the Sydney Triage to Admission Risk Tool and the Glasgow Admission Prediction Score [22–
25]. Notably, a progressive modeling approach that uses information available at later timepoints, such as lab tests ordered, medications given, and diagnoses entered by the ED provider
during the patient’s current visit, has been able to achieve high predictive power and indicates
the utility of these features [20,21]. We hypothesized that extracting such feature (...truncated)