Integrating proteomic or transcriptomic data into metabolic models using linear bound flux balance analysis

Bioinformatics, Nov 2018

Transcriptomics and proteomics data have been integrated into constraint-based models to influence flux predictions. However, it has been reported recently for Escherichia coli and Saccharomyces cerevisiae, that model predictions from parsimonious flux balance analysis (pFBA), which does not use expression data, are as good or better than predictions from various algorithms that integrate transcriptomics or proteomics data into constraint-based models.

Bioinformatics, 34(22), 2018, 3882–3888
doi: 10.1093/bioinformatics/bty445
Advance Access Publication Date: 5 June 2018
Original Paper

Systems biology

Mingyuan Tian¹,² and Jennifer L. Reed¹,²,*

¹ Department of Chemical & Biological Engineering and ² Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53705, USA

*To whom correspondence should be addressed.

Associate Editor: Jonathan Wren

Received on November 2, 2017; revised on April 3, 2018; editorial decision on May 28, 2018; accepted on June 1, 2018

Abstract

Motivation: Transcriptomics and proteomics data have been integrated into constraint-based models to influence flux predictions. However, it has been reported recently for Escherichia coli and Saccharomyces cerevisiae, that model predictions from parsimonious flux balance analysis (pFBA), which does not use expression data, are as good or better than predictions from various algorithms that integrate transcriptomics or proteomics data into constraint-based models.

Results: In this paper, we describe a novel constraint-based method called Linear Bound Flux Balance Analysis (LBFBA), which uses expression data (either transcriptomic or proteomic) to predict metabolic fluxes. The method uses expression data to place soft constraints on individual fluxes, which can be violated. Parameters in the soft constraints are first estimated from a training expression and flux dataset before being used to predict fluxes from expression data in other conditions. We applied LBFBA to E. coli and S. cerevisiae datasets and found that LBFBA predictions were more accurate than pFBA predictions, with average normalized errors roughly half of those from pFBA. For the first time, we demonstrate a computational method that integrates expression data into constraint-based models and improves quantitative flux predictions over pFBA.

Availability and implementation: Code is available in the Supplementary data available at Bioinformatics online.
Contact:

Supplementary information: Supplementary data are available at Bioinformatics online.

© The Author(s) 2018. Published by Oxford University Press. All rights reserved.

1 Introduction

Constraint-based modeling (CBM) can be used to predict cell physiology (e.g. growth rate and metabolic fluxes) under different conditions and improve our understanding of cell metabolism. CBM has been applied in metabolic engineering (Burgard et al., 2003; Cotten and Reed, 2013; Kim et al., 2011; Tervo and Reed, 2012), metabolic comparisons (Bosi et al., 2016; Hamilton and Reed, 2012; Nuccio and Bäumler, 2014), drug discovery (Chavali et al., 2012; Kim et al., 2010; Lee et al., 2009) and other health applications (Becker and Palsson, 2008; Magnúsdóttir et al., 2017; Shlomi et al., 2008).

Recent developments in sequencing and mass spectrometry have enabled transcriptomics and proteomics datasets to become more widely available. These omics datasets can be used to derive expression-based CBM constraints and/or objective functions, which can potentially improve model predictions.

There are two fundamental ways that expression data has been integrated into constraint-based models. The first way is to directly integrate the expression information into the flux bound. For example, Åkesson et al. (2004) set the fluxes to zero if expression of their associated genes was low. E-Flux (Colijn et al., 2009) directly models the maximum allowable flux value as a function of measured gene expression. The second way is to divide the reactions into different categories based on gene expression (e.g. highly expressed or lowly expressed) and then maximize the agreement (reactions associated with highly expressed genes have high flux) or minimize the disagreement (reactions associated with lowly expressed genes should not have hig (...truncated)

Table 1. A comparison between different constraint-based methods integrating gene expression data

                                                                             Åkesson    E-flux  GIMME  iMAT  tFBA  MADE  PROM  LBFBA
Directly integrated gene expression into flux bound                          Yes        Yes     No     No    No    No    Yes   Yes
Maximized agreement or minimized violation between flux and gene expression  No         No      Yes    Yes   Yes   Yes   No    No
Needs flux data to parameterize constraints                                  No         No      No     No    No    No    No    Yes
Compared flux predictions to measured intracellular fluxes                   Yes (4^a)  No      No     No    No    No    No    Yes (37^a)
Number of experimental conditions used                                       1          1       1      1     9     4     907   28^b

^a The number of fluxes that were compared.
^b Sensitivity analysis showed that 4 or 5 conditions in the training dataset were sufficient.
Note: These methods are Åkesson (Åkesson et al., 2004), E-flux (Colijn et al., 2009), GIMME (Becker and Palsson, 2008), iMAT (Shlomi et al., 2008), tFBA (van Berlo et al., 2011), MADE (Jensen and Papin, 2011), PROM (Chandrasekaran and Price, 2010) and LBFBA.

2 Materials and methods

2.1 Overview of pFBA

CBM is a powerful tool to predict cellular phenotypes and flux distributions. The basic formulation of a constraint-based model involves different constraints and an objective function. Flux balance analysis (FBA) is one of the CBM methods often used to predict a flux distribution which maximizes biomass yield. pFBA uses the sum of the absolute value of the fluxes as an objective function [Equation (1)] and can be formulated as:

$$\min \sum_{j \in \text{Reaction}} |v_j| \tag{1}$$

s.t.

$$\sum_{j \in \text{Reaction}} S_{ij} v_j = 0 \quad \forall i \in \text{Metabolite} \tag{2}$$

$$LB_j \le v_j \le UB_j \quad \forall j \in \text{Reaction} \tag{3}$$

$$v_j \ge 0 \quad \forall j \in \text{Irreversible Reaction} \tag{4}$$

$$v_j = v_j^{ls} \quad \forall j \in \text{Extracellular Reaction} \tag{5}$$

$$v_{biomass} = v_{biomass}^{measured} \tag{6}$$

Equation (2) is the steady-state mass balance constraint, meaning there is no accumulation for each metabolite in the cell. $S$ denotes the stoichiometric matrix where $S_{ij}$ is the stoichiometric coefficient of metabolite $i$ for reaction $j$. $v_j$ is the flux through reaction $j$. Equation (3) is the enzyme capacity constraint, which imposes an upper bound ($UB_j$) and lower bound ($LB_j$) for each reaction (which is typically 1000 and −1000 mmol/gDW/h, respectively). Equation (4) ensures that fluxes through irreversible reactions are nonnegative. Equation (5) fixes the extracellular flux values to the best estimates of the extracellular fluxes ($v_j^{ls}$) obtained from a least squares fit between the metabolic model and extracellular flux measurements (see Supplementary Methods for details). Equation (6) fixes the biomass flux (i.e. growth rate) to the measured value. By solving pFBA, the flux distribution under a specific condition can be predicted.

2.2 Mathematical formulation of LBFBA

In LBFBA, gene or protein expression data are used to further tighten the upper and lower bounds for individual fluxes. LBFBA is formulated as the following optimization problem:

$$\min \sum_{j \in \text{Reaction}} |v_j| + \beta \sum_{j \in R_{exp}} \alpha_j \tag{7}$$

s.t.

$$\sum_{j \in \text{Reaction}} S_{ij} v_j = 0 \quad \forall i \in \text{Metabolite} \tag{8}$$
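
To make the difference between the objectives in Equations (1) and (7) concrete, the following is a minimal sketch in Python that scores candidate flux vectors on a toy network. It does not solve the optimization problem; a real implementation would pass the linear program to a solver. Everything specific here is an assumption for illustration and is not taken from the paper: the five-reaction network, the candidate flux vectors, the soft-bound interval, the values of beta, and in particular the form of the soft constraint (lo - alpha <= v <= hi + alpha with alpha >= 0), since the constraints following Equation (8) are not shown in this excerpt.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy network (not from the paper).
# Rows are metabolites A, B, C; columns are reactions:
#   0: uptake (-> A), 1: direct (A -> B), 2: long1 (A -> C),
#   3: long2 (C -> B), 4: biomass (B ->)
S: List[List[float]] = [
    [1, -1, -1, 0, 0],   # A
    [0, 1, 0, 1, -1],    # B
    [0, 0, 1, -1, 0],    # C
]


def is_steady_state(S: Sequence[Sequence[float]], v: Sequence[float],
                    tol: float = 1e-9) -> bool:
    """Equation (2): sum_j S_ij * v_j = 0 for every metabolite i."""
    return all(abs(sum(s * x for s, x in zip(row, v))) <= tol for row in S)


def within_bounds(v: Sequence[float], lb: Sequence[float],
                  ub: Sequence[float]) -> bool:
    """Equation (3): LB_j <= v_j <= UB_j for every reaction j."""
    return all(lo <= x <= hi for x, lo, hi in zip(v, lb, ub))


def pfba_objective(v: Sequence[float]) -> float:
    """Equation (1): sum of absolute flux values."""
    return sum(abs(x) for x in v)


def min_slack(x: float, lo: float, hi: float) -> float:
    """Smallest alpha >= 0 with lo - alpha <= x <= hi + alpha (assumed form)."""
    return max(0.0, lo - x, x - hi)


def lbfba_objective(v: Sequence[float],
                    soft_bounds: Dict[int, Tuple[float, float]],
                    beta: float) -> float:
    """Equation (7): sum_j |v_j| + beta * sum_{j in R_exp} alpha_j."""
    slack = sum(min_slack(v[j], lo, hi) for j, (lo, hi) in soft_bounds.items())
    return pfba_objective(v) + beta * slack


# All reactions are treated as irreversible here (Equation (4)), with UB = 1000.
lb = [0.0] * 5
ub = [1000.0] * 5

# Two candidates with the same uptake and biomass flux (10 each).
v_direct = [10, 10, 0, 0, 10]   # all flux through the short route
v_mix = [10, 6, 4, 4, 10]       # part of the flux through the long route

# Hypothetical expression-derived soft bounds on reaction 2 (long1).
soft = {2: (4.0, 6.0)}

for name, v in [("direct", v_direct), ("mix", v_mix)]:
    assert is_steady_state(S, v) and within_bounds(v, lb, ub)
    print(name,
          "pFBA:", pfba_objective(v),
          "LBFBA(beta=2):", lbfba_objective(v, soft, beta=2.0),
          "LBFBA(beta=0.5):", lbfba_objective(v, soft, beta=0.5))
```

In this toy case pFBA prefers the direct route because it has the smaller total flux. With the assumed soft bounds, a large beta makes the mixed distribution cheaper, while a small beta lets the direct route win by paying for the violation, which is the sense in which the constraints are soft.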


This is a preview of a remote PDF: https://academic.oup.com/bioinformatics/article-pdf/34/22/3882/48920211/bioinformatics_34_22_3882.pdf
Article home page: https://academic.oup.com/bioinformatics/article/34/22/3882/5033386

Tian, Mingyuan, Reed, Jennifer L. Integrating proteomic or transcriptomic data into metabolic models using linear bound flux balance analysis, Bioinformatics, 2018, pp. 3882-3888, Volume 34, Issue 22, DOI: 10.1093/bioinformatics/bty445