Machine learning to predict risk for community-onset Staphylococcus aureus infections in children living in southeastern United States
PLOS ONE
RESEARCH ARTICLE
Xiting Lin1, Ruijin Geng1¤a, Kurt Menke2, Mike Edelson3¤b, Fengxia Yan4, Traci Leong5,
George S. Rust6, Lance A. Waller5, Erica L. Johnson1, Lilly Cheng Immergluck1*
1 Morehouse School of Medicine, Department of Microbiology/Biochemistry/Immunology and Clinical
Research Center, Atlanta, Georgia, United States of America, 2 Septima, Copenhagen, Denmark,
3 InterDev, Roswell, Georgia, United States of America, 4 Morehouse School of Medicine, Department of
Community Health and Preventive Medicine, Atlanta, Georgia, United States of America, 5 Emory University,
Rollins School of Public Health, Department of Biostatistics & Bioinformatics, Atlanta, Georgia, United States
of America, 6 College of Medicine, and Center for Medicine and Public Health, Florida State University,
Tallahassee, Florida, United States of America
¤a Current address: Sinovac Biotech, Ltd, Beijing, China
¤b Current address: Axim Geospatial, Sun Prairie, Wisconsin, United States of America
OPEN ACCESS
Citation: Lin X, Geng R, Menke K, Edelson M, Yan
F, Leong T, et al. (2023) Machine learning to
predict risk for community-onset Staphylococcus
aureus infections in children living in southeastern
United States. PLoS ONE 18(9): e0290375.
https://doi.org/10.1371/journal.pone.0290375
Editor: Eili Y. Klein, Johns Hopkins University,
UNITED STATES
Received: October 1, 2022
Accepted: August 7, 2023
Published: September 1, 2023
Copyright: © 2023 Lin et al. This is an open access
article distributed under the terms of the Creative
Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in
any medium, provided the original author and
source are credited.
Data Availability Statement: All relevant data for
this study are available within the paper and its
Supporting information files.
Funding: LCI receives funding from the following
sources: National Library of Medicine, Grant
#1G08LM013190-01; K-08 AHRQ-Mentored
Clinical Scientist Career Research Development
Award HS024338-01. LCI and RG are supported by the
National Center for Advancing Translational
Sciences of the National Institutes of Health under
Award Number UL1TR002378. LAW receives
Abstract
Staphylococcus aureus (S. aureus) is a well-known cause of human infection, and since the
late 1990s, community-onset antibiotic-resistant infections due to methicillin-resistant
S. aureus (MRSA) have continued to cause significant disease in the United States. Skin and
soft tissue infections (SSTIs) still account for the majority of these infections in the
outpatient setting. Machine learning can predict location-based risks for community-level
S. aureus infections. Multi-year (2002–2016) electronic health records of children <19 years
old with S. aureus infections were queried for patient-level demographic, clinical, and
laboratory information. Area-level data (at the block group level) were abstracted from U.S.
Census data. A machine learning ecological niche model, maximum entropy (MaxEnt), was
applied, and model performance was assessed for specific place-based factors (determined
a priori) associated with S. aureus infections; analyses were structured to compare
methicillin-resistant (MRSA) against methicillin-sensitive S. aureus (MSSA) infections.
Differences in rates of MRSA and MSSA infections were determined by comparing infections
that occurred in the early phase (2002–2005) with those in the later phase (2006–2016).
Multi-level modeling was applied to identify risk factors for S. aureus infections. Among
16,124 unique patients with community-onset MRSA (CO-MRSA) and MSSA (CO-MSSA) infections,
the majority of infections occurred in the most densely populated neighborhoods of Atlanta’s
metropolitan area. For the MaxEnt models, the training area under the curve (AUC) ranged
from 0.771 to 0.824, and the testing AUC ranged from 0.769 to 0.839. Population density was
the area-level variable that contributed most to predicting S. aureus disease (stratified by
CO-MRSA and CO-MSSA) across the early and late periods. Race contributed more to the
CO-MRSA prediction models than to the CO-MSSA models during both the early and late
periods. Machine learning accurately predicts which densely populated areas
PLOS ONE | https://doi.org/10.1371/journal.pone.0290375 September 1, 2023
funding from the following sources: NIH/NICHD
grant R01HD092580, NIH/NIDA T32DA050552,
NIH/NIAID R01AI149527, U01AI148069, UG3AI176853,
NIH/NIEHS R01ES033530, P30ES019776, NIH/NCI
R01CA266572, NIH/NIMHD R21MD017943, CDC
6R49CE003072, 23IPA2312301. The content is
solely the responsibility of the authors and does
not necessarily represent the official views of the
National Institutes of Health. The funders had no
role in study design, data collection and analysis,
decision to publish, or preparation of the
manuscript.
Competing interests: The authors have declared
that no competing interests exist.
Pediatric CO-S. aureus and machine learning
are at highest and lowest risk for community-onset S. aureus infections over a 14-year
time span.
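The training and testing AUC values reported above can be read as the probability that a randomly chosen case location receives a higher predicted suitability score than a randomly chosen background location. The following is a minimal sketch of that calculation, not the study's code: the scores are hypothetical, the helper names `auc` and `split_train_test` are illustrative, and the 25% hold-out fraction is an assumption rather than a setting taken from the paper.

```python
import random


def auc(scores_pos, scores_neg):
    """Rank-based AUC: the share of (presence, background) pairs in which
    the presence point scores higher; ties count as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def split_train_test(items, test_fraction=0.25, seed=0):
    """Shuffle a copy of the items and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical suitability scores in [0, 1]; these are NOT data from the study.
presence_scores = [0.9, 0.8, 0.75, 0.6, 0.55, 0.4, 0.85, 0.7]
background_scores = [0.1, 0.3, 0.5, 0.2, 0.65, 0.35, 0.45, 0.25]

# Training AUC uses the presences kept for fitting; testing AUC uses the
# held-out presences. Both are compared against the same background points.
train_pos, test_pos = split_train_test(presence_scores)
train_auc = auc(train_pos, background_scores)
test_auc = auc(test_pos, background_scores)
print(f"training AUC = {train_auc:.3f}, testing AUC = {test_auc:.3f}")
```

An AUC of 0.5 corresponds to a model that ranks case and background locations no better than chance, and 1.0 to perfect separation. Similar training and testing values, as in the ranges reported above, suggest the model is not heavily overfit to the training locations.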
Introduction
Staphylococcus aureus (S. aureus) is a bacterium that is part of the normal human flora and
is also a source of human infection. Approximately 30–40% of humans can be asymptomatic
‘carriers’ of S. aureus [1], and from the late 1990s until recently, infections due to
community-onset methicillin-resistant S. aureus (CO-MRSA), an antibiotic-resistant form of
the organism, have increased dramatically, causing both non-invasive and invasive disease [2].
Infections due to community-onset S. aureus (CO-S. aureus) appear to be increasing
nationally and globally [3]. Skin and soft tissue infections (SSTIs) account for most
community-onset infections due to both CO-MRSA and community-onset methicillin-sensitive
S. aureus (CO-MSSA) [4, 5]. Moreover, over the last decade, while community-onset SSTIs
have continued to occur at high rates, the etiology has shifted proportionately toward
CO-MSSA and away from CO-MRSA [6, 7]. Risk factors for these community-onset infections
include living in densely populated areas [8, 9] and socioeconomic disadvantage [10]. Racial
and ethnic disparities exist for community-onset S. aureus infections, and risks associated
with pediatric-related infections include daycare attendance, prior antibiotic use, family
history of SSTIs, and public health insurance [10, 11].
However, the relationship between specific geographic location and risks tied to location
for S. aureus infections has not been well characterized. Although several studies have explored
socio-ecological risk factors for CO-MRSA [8, 12], the place-based associations between
patients with CO-MRSA infections and identified risks have not been elucidated [9]. Moreover, the location-based associations tied to risk for staphylococcal infections at the community level among children have only been recentl (...truncated)