Longitudinal Prediction of the Infant Gut Microbiome with Dynamic Bayesian Networks

Scientific Reports, Feb 2016

Sequencing of the 16S rRNA gene allows comprehensive assessment of bacterial community composition from human body sites. We used previously published, publicly accessible data on 58 preterm infants in the Neonatal Intensive Care Unit who underwent frequent stool collection. We constructed Dynamic Bayesian Networks (DBNs) from the data and analyzed their predictive performance and network characteristics. The resulting DBN model of the infant gut microbial ecosystem explicitly captured specific relationships and general trends in the data: increasing amounts of Clostridia, residual amounts of Bacilli, and increasing amounts of Gammaproteobacteria that then give way to Clostridia. DBNs with fewer edges were more accurate overall, although less so on harder-to-predict subjects (p = 0.045). DBNs provided quantitative likelihood estimates for rare abruption events. Iterative prediction was less accurate (p < 0.001), but showed remarkable insensitivity to initial conditions and predicted convergence to a mix of Clostridia, Gammaproteobacteria, and Bacilli. DBNs were able to identify important relationships between microbiome taxa and predict future changes in microbiome composition from measured or synthetic initial conditions. DBNs also provided likelihood estimates for sudden, dramatic shifts in microbiome composition, which may be useful in guiding further analysis of those samples.

Open access. Received: 05 June 2015. Accepted: 31 December 2015. Published: 08 February 2016.

Michael J. McGeachie1,*, Joanne E. Sordillo1,*, Travis Gibson1, George M. Weinstock2, Yang-Yu Liu1, Diane R. Gold1, Scott T. Weiss1 & Augusto Litonjua1

The microbiota living in the human gut performs a number of vital functions for homeostasis, including the harvest of essential nutrients [1,2], synthesis of vitamins [3], metabolism of xenobiotics [4], and the development and maintenance of the immune system [5,6]. Alterations in the gut microbiome have been observed in a number of disease states [7], and may be directly connected to pathogenesis. Microbes that populate an infant’s gut after birth serve as critical immune stimuli in the first days of life [8], and could influence the composition of the “mature” gut microbiome, with subsequent implications for the health of the human host in both early life and adulthood.

Studies on the initial colonization of the infant gut are very limited, with sparse, if any, longitudinal data [1]. Early studies have proposed an initial predominance of facultative anaerobes, followed by a progression to anaerobic bacteria [9,10]. The “first colonizers” of the infant gut may derive from maternal sources (vaginal flora, skin flora, and gut flora) or from environmental microbes. Microbiome studies of meconium [11], amniotic fluid, and placenta [12] suggest that infants encounter microbes even before birth. Microbial rDNA present in the intrauterine environment suggests that prenatal sources may also contribute to gut colonization [12]. The most comprehensive study on the progression of the infant gut microbiota thus far examined 58 preterm infants in a neonatal intensive care unit, with repeated measurements taken every few days on all study subjects, starting within the first days of life and ending at approximately one month of age [13]. In this study, La Rosa et al. used longitudinal analysis of time series data to demonstrate that the microbiota of the infant gut was initially dominated by Bacilli at birth, giving way to Gammaproteobacteria, then Clostridia at the end of the first month of life.
Gestational age appeared to have the greatest influence on the pace (but not the pattern) of bacterial microbiome progression, and the non-random assembly observed suggests that host biology or consistent exposure sources (i.e., infant diet, maternal flora) play a key role in infant gut population (as compared to chance encounters with microbes in the environment).

1 Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA. 2 The Jackson Laboratory for Genomic Medicine, Farmington, CT. * These authors contributed equally to this work. Correspondence and requests for materials should be addressed to M.J.M.

Although classic longitudinal analysis captures changes in outcomes over time, this standard approach has many limitations. Often, individual taxa are treated as separate outcomes, and information on the connections between bacteria (i.e., how one bacterial population may influence another over time) is lost. Network-based methods are an alternative approach to longitudinal gut microbiome modeling. In general, network studies on gut microbiota have been limited thus far; most consist of correlation analyses, which others have noted to have poor asymptotic prediction [14]. Two network methodologies specifically designed to capture complex interactions and dynamic change within the microbiome over time are the generalized Lotka-Volterra model and Dynamic Bayesian Networks. The generalized Lotka-Volterra (GLV) model and other dynamic systems identification formalisms [15] use longitudinal microbiota compositional data to identify parameters in ordinary differential equations that describe the dynamics of microbial ecosystems [16,17].
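
To make the idea of identifying GLV parameters from longitudinal data concrete, the following is a minimal sketch and not the method of any cited study. It assumes a discrete-time GLV form, x_i(t+1) = x_i(t) * exp(r_i + sum_j A_ij x_j(t)), noise-free synthetic abundances, and a small ridge penalty. The function names `simulate_glv` and `fit_glv_ridge`, the three-taxon system, and all numeric values are hypothetical choices made only for illustration.

```python
import numpy as np


def simulate_glv(r, A, x0, steps):
    """Iterate the assumed discrete-time GLV map x(t+1) = x(t) * exp(r + A @ x(t))."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x * np.exp(r + A @ x))
    return np.array(xs)  # shape: (steps + 1, n_taxa)


def fit_glv_ridge(x_prev, x_next, lam=1e-8):
    """Estimate growth rates r and interactions A by ridge regression.

    Taking logs of the map gives log(x_next / x_prev) = r + A @ x_prev,
    which is linear in the unknown parameters.
    """
    targets = np.log(x_next / x_prev)                     # (n_obs, n_taxa)
    design = np.hstack([np.ones((x_prev.shape[0], 1)), x_prev])
    penalty = lam * np.eye(design.shape[1])
    penalty[0, 0] = 0.0                                   # leave the intercept unpenalized
    coef = np.linalg.solve(design.T @ design + penalty, design.T @ targets)
    # Column i of coef holds taxon i's parameters: intercept r_i, then row i of A.
    return coef[0], coef[1:].T


# Hypothetical three-taxon system (values chosen only to keep the dynamics bounded).
r_true = np.array([0.5, 0.3, 0.4])
A_true = np.array([
    [-1.0, -0.2, 0.0],
    [0.1, -0.8, -0.1],
    [-0.3, 0.0, -0.9],
])

# Several short trajectories from different starting points, so that the
# observed states are spread out enough to determine every parameter.
rng = np.random.default_rng(0)
prev_blocks, next_blocks = [], []
for _ in range(20):
    traj = simulate_glv(r_true, A_true, rng.uniform(0.1, 1.5, size=3), steps=5)
    prev_blocks.append(traj[:-1])
    next_blocks.append(traj[1:])

r_hat, A_hat = fit_glv_ridge(np.vstack(prev_blocks), np.vstack(next_blocks))
```

On noise-free synthetic data like this, the estimates should closely match the generating parameters. Real microbiome data are noisy, sparse, and compositional, which is the concern raised about exact parameter estimation in these models.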
For instance, GLV has been used in a study of the murine gut microbiome to generate a network of interactions between bacterial taxa [18]; this network included almost all possible edges and was not used for prediction. In another microbiome study, a continuous GLV model was assumed, and coefficients related to the individual microbe growth rates, the strengths of the microbe-microbe interactions, and susceptibility to antibiotics were learned using linear regression with regularization [17]. Discrete GLV, where coefficients were learned using a sparse linear regression technique, has also been employed [16]. Alternatively, other dynamic systems models have been used, in one case modeling two of possibly many interacting microbes in the gut [19]. These endeavors build upon a rich history of systems-identification literature, spanning the theoretical and practical [20], and these approaches have shown that data-derived models of microbiota dynamics can have significant analytic and predictive power [17]. However, the degree to which the available microbiome datasets meet the rigorous requirements of exact parameter estimation in these models remains an outstanding question [16]. Methods that include explicit parameter estimation and allowances for noisy data may be more appropriate. One such method used Bayesian statistics to help inform dynamic models of a single independent bacterial taxon’s change in response to antibiotics [21]. Bayesian networks (BNs) are an appropriate tool to model the interaction of many microbial taxa (...truncated)


This is a preview of a remote PDF: https://www.nature.com/articles/srep20359.pdf
Article home page: https://www.nature.com/articles/srep20359

Michael J. McGeachie, Joanne E. Sordillo, Travis Gibson, George M. Weinstock, Yang-Yu Liu, Diane R. Gold, Scott T. Weiss, Augusto Litonjua. Longitudinal Prediction of the Infant Gut Microbiome with Dynamic Bayesian Networks, Scientific Reports, 2016, Issue: 6, DOI: 10.1038/srep20359