Longitudinal Prediction of the Infant Gut Microbiome with Dynamic Bayesian Networks
www.nature.com/scientificreports
received: 05 June 2015
accepted: 31 December 2015
Published: 08 February 2016
Michael J. McGeachie1,*, Joanne E. Sordillo1,*, Travis Gibson1, George M. Weinstock2,
Yang-Yu Liu1, Diane R. Gold1, Scott T. Weiss1 & Augusto Litonjua1
Sequencing of the 16S rRNA gene allows comprehensive assessment of bacterial community
composition from human body sites. We used previously published, publicly accessible data on 58 preterm
infants in the Neonatal Intensive Care Unit who underwent frequent stool collection. We
constructed Dynamic Bayesian Networks (DBNs) from the data and analyzed their predictive performance and
network characteristics. The resulting DBN model of the infant gut microbial ecosystem
explicitly captured specific relationships and general trends in the data: increasing amounts of
Clostridia, residual amounts of Bacilli, and increasing amounts of Gammaproteobacteria that then
give way to Clostridia. Predictions from DBNs with fewer edges were more accurate overall,
although less so on harder-to-predict subjects (p = 0.045). DBNs provided quantitative likelihood
estimates for rare abruption events. Iterative prediction was less accurate (p < 0.001), but showed
remarkable insensitivity to initial conditions and predicted convergence to a mix of Clostridia,
Gammaproteobacteria, and Bacilli. DBNs were able to identify important relationships between
microbiome taxa and predict future changes in microbiome composition from measured or synthetic
initial conditions. DBNs also provided likelihood estimates for sudden, dramatic shifts in microbiome
composition, which may be useful in guiding further analysis of those samples.
Microbiome
The microbiota living in the human gut performs a number of vital functions for homeostasis, including the
harvest of essential nutrients1,2, synthesis of vitamins3, metabolism of xenobiotics4, and the development and
maintenance of the immune system5,6. Alterations in the gut microbiome have been observed in a number of
disease states7, and may be directly connected to pathogenesis. Microbes that populate an infant’s gut after birth
serve as critical immune stimuli in the first days of life8, and could influence the composition of the “mature” gut
microbiome, with subsequent implications for the health of the human host in both early life and adulthood.
Studies on the initial colonization of the infant gut are very limited, with sparse, if any, longitudinal data1.
Early studies have proposed an initial predominance of facultative anaerobes, followed by a progression to anaerobic bacteria9,10. The “first colonizers” of the infant gut may derive from maternal sources (vaginal flora, skin
flora, and gut flora) or from environmental microbes. Microbiome studies of meconium11 and of amniotic fluid and
placenta12 suggest that infants encounter microbes even before birth. Microbial rDNA present in the intrauterine
environment suggests that prenatal sources may also contribute to gut colonization12.
The most comprehensive study on the progression of the infant gut microbiota thus far examined 58 preterm
infants in a neonatal intensive care unit, with repeated measurements taken every few days on all study subjects
starting within the first days of life, and ending at approximately one month of age13. In this study, La Rosa et al.
used longitudinal analysis of time series data to demonstrate that the microbiota of the infant gut was initially
dominated by Bacilli at birth, giving way to Gammaproteobacteria, then Clostridia at the end of the first month of
life. Gestational age appeared to have the greatest influence on the pace (but not the pattern) of bacterial microbiome progression, and the non-random assembly observed suggests that host biology or consistent
exposure sources (e.g. infant diet, maternal flora) play a key role in populating the infant gut (as compared to chance
encounters with microbes in the environment).
1Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA. 2The Jackson Laboratory for
Genomic Medicine, Farmington, CT. *These authors contributed equally to this work. Correspondence and requests
for materials should be addressed to M.J.M. (email: )
Scientific Reports | 6:20359 | DOI: 10.1038/srep20359
Although classic longitudinal analysis captures changes in outcomes over time, this standard approach has
many limitations. Often, individual taxa are treated as separate outcomes, and information on the connections
between bacteria (i.e. how one bacterial population may influence another over time) is lost. Network-based
methods are an alternative approach to longitudinal gut microbiome modeling. In general, network studies on
gut microbiota have been limited thus far; most consist of correlation analyses, which, as others have noted,
have poor asymptotic prediction14. Two network methodologies specifically designed to capture complex
interactions and dynamic change within the microbiome over time are the generalized Lotka-Volterra model
and Dynamic Bayesian Networks.
The generalized Lotka-Volterra (GLV) model and other dynamic systems identification formalisms15 use longitudinal microbiota compositional data to identify parameters in ordinary differential equations that describe the
dynamics of microbial ecosystems16,17. For instance, GLV has been used in a study of the murine gut microbiome
to generate a network of interactions between bacterial taxa18; this network included almost all possible edges and
was not used for prediction. In another microbiome study, a continuous GLV model was assumed and coefficients
related to the individual microbe growth rates, the strengths of the microbe-microbe interactions, and susceptibility to antibiotics were learned using linear regression with regularization17. Discrete GLV, where coefficients
were learned using a sparse linear regression technique, has also been employed16. Alternatively, other dynamic
systems models have been used, in one case modeling two of possibly many interacting microbes in the gut19.
These endeavors build upon a rich history of systems-identification literature, spanning the theoretical and practical20, and these approaches have shown that data-derived models of microbiota dynamics can have significant
analytic and predictive power17. However, the degree to which the microbiome datasets available meet the rigorous requirements of exact parameter estimation in these models remains an outstanding question16. Methods
that include explicit parameter estimation and allowances for noisy data may be more appropriate. One such
method used Bayesian statistics to help inform dynamic models of a single independent bacterial taxon’s change
in response to antibiotics21.
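To make the regression-based GLV fitting described above concrete, the following is a minimal sketch and not the procedure of any cited study. It assumes a discrete-time GLV of the form log x_i(t+1) - log x_i(t) = dt * (r_i + sum_j A_ij x_j(t)), absolute (not relative) abundances, noise-free synthetic data, and ridge regression as a stand-in for the regularized or sparse regressions mentioned in the text. The function names (simulate_glv, fit_glv_ridge), the step size dt, the penalty lam, and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_glv(r, A, x0, steps, dt=0.1):
    """Simulate a discrete-time GLV: log x(t+1) - log x(t) = dt * (r + A @ x(t))."""
    X = np.empty((steps + 1, len(x0)))
    X[0] = x0
    for t in range(steps):
        X[t + 1] = X[t] * np.exp(dt * (r + A @ X[t]))
    return X

def fit_glv_ridge(trajectories, dt=0.1, lam=1e-6):
    """Estimate growth rates r and interaction matrix A by ridge regression.

    Each trajectory is an array of shape (time points, taxa). The regression
    targets are scaled log-abundance differences; the predictors are an
    intercept column (for r) plus the abundances at the earlier time point.
    """
    Y = np.vstack([np.diff(np.log(X), axis=0) / dt for X in trajectories])
    Z = np.vstack([np.hstack([np.ones((len(X) - 1, 1)), X[:-1]])
                   for X in trajectories])
    # Closed-form ridge solution; the intercept is penalized too, for simplicity.
    G = Z.T @ Z + lam * np.eye(Z.shape[1])
    B = np.linalg.solve(G, Z.T @ Y)   # shape (taxa + 1, taxa)
    return B[0], B[1:].T              # r_hat, A_hat with A_hat[i, j] = effect of j on i

# Hypothetical three-taxon system, used only to exercise the code.
rng = np.random.default_rng(0)
r_true = np.array([1.0, 0.7, 0.5])
A_true = np.array([[-1.0, -0.2,  0.0],
                   [-0.1, -0.8, -0.3],
                   [ 0.0, -0.2, -0.6]])
trajectories = [simulate_glv(r_true, A_true, rng.uniform(0.1, 1.5, size=3), steps=40)
                for _ in range(8)]
r_hat, A_hat = fit_glv_ridge(trajectories)
```

On noise-free synthetic data the estimates should closely track the generating parameters. With real sequencing data, which is noisy, sparsely sampled, and compositional, the recovered coefficients would be far less certain, which is the concern about exact parameter estimation raised above.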
Bayesian networks (BNs) are an appropriate tool to model the interaction of many microbial taxa (...truncated)