Dynamic interaction network inference from longitudinal microbiome data

Microbiome, Apr 2019

Several studies have focused on the microbiota living in environmental niches including human body sites. In many of these studies, researchers collect longitudinal data with the goal of understanding not only the composition of the microbiome but also the interactions between the different taxa. However, analysis of such data is challenging, and very few methods have been developed to reconstruct dynamic models from time series microbiome data. Here, we present a computational pipeline that enables the integration of data across individuals for the reconstruction of such models. Our pipeline starts by aligning the data collected for all individuals. The aligned profiles are then used to learn a dynamic Bayesian network which represents causal relationships between taxa and clinical variables. Testing our methods on three longitudinal microbiome data sets, we show that our pipeline improves upon prior methods developed for this task. We also discuss the biological insights provided by the models, which include several known and novel interactions. The extended CGBayesNets package is freely available under the MIT Open Source license agreement. The source code and documentation can be downloaded from https://github.com/jlugomar/longitudinal_microbiome_analysis_public. We propose a computational pipeline for analyzing longitudinal microbiome data. Our results provide evidence that microbiome alignments coupled with dynamic Bayesian networks improve predictive performance over previous methods and enhance our ability to infer biological relationships within the microbiome and between taxa and clinical factors.

Lugo-Martinez et al. Microbiome (2019) 7:54
https://doi.org/10.1186/s40168-019-0660-3

METHODOLOGY, Open Access

Jose Lugo-Martinez1†, Daniel Ruiz-Perez2†, Giri Narasimhan2,3* and Ziv Bar-Joseph1*
Keywords: Dynamic interaction network inference, Longitudinal microbiome analysis, Microbial composition prediction, Dynamic Bayesian networks, Temporal alignment

*Correspondence: Giri Narasimhan and Ziv Bar-Joseph
†Jose Lugo-Martinez and Daniel Ruiz-Perez contributed equally to this work.
1 Computational Biology Department, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh 15213, Pennsylvania, USA
2 Bioinformatics Research Group (BioRG), Florida International University, 11200 SW 8th Street, Miami 33199, Florida, USA
Full list of author information is available at the end of the article

© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

Multiple efforts have attempted to study the microbiota living in environmental niches including human body sites. These microbial communities can play beneficial as well as harmful roles in their hosts and environments. For instance, microbes living in the human gut perform numerous vital functions for homeostasis, ranging from harvesting essential nutrients to regulating and maintaining the immune system. Alternatively, a compositional imbalance known as dysbiosis can lead to a wide range of human diseases [1], and is linked to environmental problems such as harmful algal blooms [2].

While many studies profile several different types of microbial taxa, it is not easy in most cases to uncover the complex interactions within the microbiome and between taxa and clinical factors (e.g., gender, age, ethnicity). Microbiomes are inherently dynamic; thus, to fully reconstruct these interactions, we need to obtain and analyze longitudinal data [3]. Examples include characterizing temporal variation of the gut microbial communities from pre-term infants during the first weeks of life, and understanding responses of the vaginal microbiota to biological events such as menses. Even when such longitudinal data are collected, the ability to extract an accurate set of interactions from the data is still a major challenge.

To address this challenge, we need computational time-series tools that can handle data sets that may exhibit missing or noisy data and non-uniform sampling. Furthermore, a critical issue which naturally arises when dealing with longitudinal biological data is that of temporal rate variations. Given longitudinal samples from different individuals (for example, gut microbiome), we cannot expect the rates at which interactions take place to be exactly the same across these individuals. Factors including age, gender, and external exposure may lead to faster or slower rates of change between individuals. Thus, to analyze longitudinal data across individuals, we need to first align the microbial data. Using the aligned profiles, we can next employ other methods to construct a model for the process being studied.

Most current approaches for analyzing longitudinal microbiome data focus on changes in outcomes over time [4, 5]. The main drawback of this approach is that individual microbiome entities are treated as independent outcomes; hence, potential relationships between these entities are ignored. An alternative approach involves the use of dynamical systems such as the generalized Lotka-Volterra (gLV) models [6–10].
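For readers unfamiliar with gLV models, the following is a minimal sketch, not taken from the paper, of the standard gLV form dx_i/dt = x_i (mu_i + sum_j A[i][j] x_j), integrated with a simple forward-Euler scheme. The two-taxon growth rates `mu`, the interaction matrix `A`, the step size, and the function names are all illustrative assumptions rather than fitted or published values.

```python
def glv_step(x, mu, A, dt):
    """One forward-Euler step of dx_i/dt = x_i * (mu_i + sum_j A[i][j] * x_j)."""
    n = len(x)
    # Per-capita growth rate of each taxon given the current community state.
    growth = [mu[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Clamp at zero because abundances cannot be negative.
    return [max(0.0, x[i] + dt * x[i] * growth[i]) for i in range(n)]


def simulate_glv(x0, mu, A, dt=0.01, steps=5000):
    """Return the list of community states visited, starting from x0."""
    x = list(x0)
    trajectory = [x]
    for _ in range(steps):
        x = glv_step(x, mu, A, dt)
        trajectory.append(x)
    return trajectory


# Hypothetical two-taxon community (values chosen only for illustration):
# negative diagonal entries are self-limitation, off-diagonals are competition.
mu = [1.0, 0.5]
A = [[-1.0, -0.5],
     [-0.2, -1.0]]

trajectory = simulate_glv([0.1, 0.1], mu, A)
final = trajectory[-1]
print("abundances at end of simulation:", final)
```

In this toy setting the simulation settles near the point where `mu + A x = 0`. Note that the model needs densely and regularly sampled abundances to be fitted reliably, which is the limitation raised in the text for sparse, non-uniform microbiome time series.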
While gLV and other dynamical systems can help in studying the stability of temporal bacterial communities, they are not well-suited for temporally sparse and non-uniform high-dimensional microbiome time series data (e.g., limited frequency and number of samples), or for noisy data [3, 10]. Additionally, most of these methods eliminate any taxa whose relative abundance profile exhibits a zero entry (i.e., not present in a measurable amount at one or more of the measured time points). Finally, probabilistic graphical models (e.g., hidden Markov models, Kalman filters, and dynamic Bayesian networks) are machine learning tools which can effectively model dynamic processes, as well as discover causal interactions [11].

In this work, we first adapt statistical spline estimation and dynamic time-warping techniques for aligning time-series microbial data so that they can be integrated across individuals. We use the aligned data to learn a Dynamic Bayesian Network (DBN), where nodes represent microbial taxa, clinical conditions, or demographic factors and edges represent causal relationships between these entities. We evaluate our model by using multiple data sets comprised of the mi (...truncated)
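The idea of aligning profiles from individuals who change at different rates can be illustrated with a small sketch. This is not the authors' implementation: piecewise-linear interpolation stands in for spline estimation, and the warping step is textbook dynamic time warping with an absolute-difference cost. The function names, sampling times, and abundance values are hypothetical.

```python
def resample(times, values, grid):
    """Linearly interpolate a non-uniformly sampled series onto `grid`.

    Stand-in for spline smoothing. Assumes `times` and `grid` are sorted
    and that every grid point lies within [times[0], times[-1]].
    """
    out = []
    k = 0
    for t in grid:
        # Advance to the sampling interval [times[k], times[k + 1]] containing t.
        while k < len(times) - 2 and times[k + 1] < t:
            k += 1
        t0, t1 = times[k], times[k + 1]
        w = (t - t0) / (t1 - t0)
        out.append((1 - w) * values[k] + w * values[k + 1])
    return out


def dtw(a, b):
    """Classic dynamic time warping: return (total cost, warping path)."""
    n, m = len(a), len(b)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            D[i][j] = d + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    # Backtrack from the end to recover which time points were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1))
    path.reverse()
    return D[n][m], path


# Hypothetical abundance of one taxon in two individuals: the same rise and
# fall, but individual B progresses at half the rate of individual A.
series_a = resample([0, 1, 3, 6], [0.0, 1.0, 3.0, 0.0], range(0, 7))
series_b = resample([0, 2, 6, 12], [0.0, 1.0, 3.0, 0.0], range(0, 13))

cost, path = dtw(series_a, series_b)
print("alignment cost:", cost)
print("matched (day in A, day in B) pairs:", path)
```

The warping path pairs each day of individual A with the days of individual B at which the taxon is in a comparable state, which is the kind of correspondence needed before profiles from several individuals are pooled to learn a single model.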


This is a preview of a remote PDF: https://microbiomejournal.biomedcentral.com/track/pdf/10.1186/s40168-019-0660-3
Article home page: https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-019-0660-3

Jose Lugo-Martinez, Daniel Ruiz-Perez, Giri Narasimhan, Ziv Bar-Joseph. Dynamic interaction network inference from longitudinal microbiome data, Microbiome, 2019, pp. 1-14, Volume 7, Issue 1, DOI: 10.1186/s40168-019-0660-3