influx_s: increasing numerical stability and precision for metabolic flux analysis in isotope labelling experiments
BIOINFORMATICS
ORIGINAL PAPER
Systems biology
Vol. 28 no. 5 2012, pages 687–693
doi:10.1093/bioinformatics/btr716
Advance Access publication December 30, 2011
Serguei Sokol1,2,3, Pierre Millard1,2,3 and Jean-Charles Portais1,2,3,∗
1 INSA, UPS, INP, LISBP, Université de Toulouse, 135 Avenue de Rangueil, F-31077 Toulouse, 2 INRA, UMR792, Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse and 3 CNRS, UMR5504, F-31400 Toulouse, France
Associate Editor: Trey Ideker
Received on July 26, 2011; revised on December 1, 2011; accepted
on December 25, 2011
1 INTRODUCTION
Metabolic flux analysis (MFA) aims at quantifying the actual
rates of biochemical reactions occurring in living cells. In recent
decades, MFA has been increasingly used to identify novel metabolic
pathways (Fischer and Sauer, 2003; Peyraud et al., 2009) and to gain an
in-depth understanding of metabolism (Nicolas et al., 2007; Perrenoud
and Sauer, 2005; Sauer et al., 2004). It is extensively used in
biotechnology to improve the metabolic properties of industrially
relevant organisms (Becker et al., 2007; van Gulik et al., 2000).
More recently, MFA has been successfully integrated with other
omics tools (transcriptomics, proteomics, metabolomics, etc.) to
obtain novel biological insights through systems biology (Ishii et al.,
2007; Lemuth et al., 2008; Shimizu, 2004).
The growing interest in MFA underlines the importance of
developing reliable tools. The present contribution particularly
addresses the need for accurate and stable algorithms for solving
the least-squares problem that underlies the calculation of fluxes
in MFA.
∗ To whom correspondence should be addressed.
In a stationary metabolic system, the biochemical reactions which
occur in a cell can be described by the following stoichiometric linear
equation:
Sv = 0
where S is the m×n stoichiometric matrix, whose m rows and n columns
correspond to the metabolites and reactions, respectively, and
v is the vector of all net fluxes. Each component of the vector
v expresses a net flux, i.e. the net quantity of material converted
by a particular reaction per unit time. The whole equation system
expresses the mass conservation law in the metabolic system: at
metabolic (quasi-)steady-state, the intracellular concentrations of
metabolites remain constant.
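As a minimal sketch of the balance Sv = 0, the following Python snippet builds the stoichiometric matrix of a small hypothetical network (uptake → A, A → B, A → C, B → out, C → out) and evaluates Sv for two flux vectors. The network, the flux values and the helper name mass_balance are illustrative assumptions and are not taken from the influx_s software.

```python
# Hypothetical toy network, for illustration only:
# v1: uptake -> A, v2: A -> B, v3: A -> C, v4: B -> out, v5: C -> out
S = [
    [1, -1, -1,  0,  0],  # balance of metabolite A
    [0,  1,  0, -1,  0],  # balance of metabolite B
    [0,  0,  1,  0, -1],  # balance of metabolite C
]

def mass_balance(S, v):
    """Return the product S v, i.e. the net accumulation rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

v_steady = [10.0, 6.0, 4.0, 6.0, 4.0]      # satisfies S v = 0
v_unbalanced = [10.0, 6.0, 3.0, 6.0, 3.0]  # A is produced faster than consumed

print(mass_balance(S, v_steady))      # [0.0, 0.0, 0.0]
print(mass_balance(S, v_unbalanced))  # [1.0, 0.0, 0.0]
```

A non-zero entry in the result means that the corresponding metabolite would accumulate or be depleted, which contradicts the steady-state assumption.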
For most metabolic systems, the stoichiometric system Sv = 0 is underdetermined, i.e. the number of equations m is lower than the number
of fluxes n. Some fluxes can be measured experimentally. This
is generally true of input and output fluxes, but such measurements are usually not
enough to allow the calculation of all fluxes in the system. The
remaining degrees of freedom, the so-called free fluxes, need additional
equations to be calculated. This can be achieved using different
approaches. For example, flux balance analysis (FBA) requires
maximization of some linear cost function like biomass yield
(Edwards et al., 2001). In the approaches using isotope labelling
experiments (ILE) discussed in this article, additional relationships
between fluxes come from the measurement of the labelling patterns
(or isotopomer distributions) of selected metabolites. Currently,
these measurements can be made by mass spectrometry (mass
isotopomers) or by nuclear magnetic resonance (NMR) (positional
isotopomers).
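The counting argument above can be illustrated with a short Python sketch: the number of degrees of freedom is n minus the rank of the constraint system, and each independent flux measurement adds one constraint row. The toy network and the function names rank and degrees_of_freedom are hypothetical and serve only as an illustration; this is not how influx_s selects free fluxes.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix by Gauss-Jordan elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def degrees_of_freedom(S, measured=()):
    """n minus the rank of S augmented with one unit row per measured flux index."""
    n = len(S[0])
    constraints = [list(row) for row in S]
    for j in measured:
        unit_row = [0] * n
        unit_row[j] = 1  # fixes flux j to its measured value
        constraints.append(unit_row)
    return n - rank(constraints)

# Hypothetical toy network: uptake -> A, A -> B, A -> C, B -> out, C -> out
S = [
    [1, -1, -1,  0,  0],
    [0,  1,  0, -1,  0],
    [0,  0,  1,  0, -1],
]

print(degrees_of_freedom(S))                   # 2
print(degrees_of_freedom(S, measured=(0,)))    # 1
print(degrees_of_freedom(S, measured=(0, 3)))  # 0
```

In this toy case, measuring the uptake flux alone still leaves one free flux (the split between the two branches), which is the kind of gap that labelling data are meant to close.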
The MFA-ILE approach was developed in the 1950s, when 14C
radioactive isotopes were used to elucidate fragments of carbon
metabolism in rat liver (Strisower et al., 1951; Weinman et al., 1950).
Since the 1980s–1990s, the stable isotope 13C has been preferred
to the radioactive 14C. For many years, the equations
describing the label distribution in a given metabolic network and
their solutions were derived by hand (Heath, 1968). In the early
1990s, general mathematical descriptions of the labelling problem
were introduced (Schuster et al., 1992; Wiechert, 1994; Zupke and
Stephanopoulos, 1994). This generalization led to a need to solve
algebraic systems of high dimensions (often ill-conditioned) to find
the labelling state of a given metabolic network. This paved the way
for the intensive use of applied mathematics in the MFA field.
© The Author 2011. Published by Oxford University Press. All rights reserved. For Permissions, please email:
ABSTRACT
Motivation: The problem of stationary metabolic flux analysis based
on isotope labelling experiments first appeared in the early 1950s
and was essentially solved in the early 2000s. Several algorithms and
software packages are available for this problem. However, the
generic stochastic algorithms (simulated annealing or evolutionary
algorithms) currently used in these software packages require a lot of time
to achieve acceptable precision. For deterministic algorithms, a
common drawback is the lack of convergence stability for ill-conditioned systems or when started from a random point.
Results: In this article, we present a new deterministic algorithm
with significantly increased numerical stability and accuracy of flux
estimation compared with commonly used algorithms. It requires
relatively short CPU time (from several seconds to several minutes
with a standard PC architecture) to estimate fluxes in the central
carbon metabolism network of Escherichia coli.
Availability: The software package influx_s implementing this
algorithm is distributed under an open-source licence at
http://metasys.insa-toulouse.fr/software/influx/
Contact:
Supplementary information: Supplementary data are available at
Bioinformatics online.
2 PROBLEM FORMULATION
In this section, we use the same conventions and notations as
Möllney et al. (1999) and Wiechert et al. (1999). The free fluxes and
free scaling parameters in a given metabolic system can be estimated
by solving a least-squares problem that can be written as follows:
$$\operatorname*{arg\,min}_{\Theta,\omega} T(\Theta,\omega) = \|F_w(\Theta)-w\|^2_{\Sigma_w} + \|F_y(\Theta,\omega)-y\|^2_{\Sigma_y} \qquad (1)$$
Here T is a cost function representing the sum of squared weighted
errors. Its arguments, the free flux vector Θ and the free scale vector ω,
are the free parameters that are adjusted during the minimization
process. Vectors w and y are the vectors of measured fluxes and
labelling data, respectively, whereas vector functions Fw and Fy
represent the data simulations matching measured values w and y.
Matrices Σw and Σy are covariance matrices characterizing the
experimental noise in flux and labelling data, respectively. They
are often assumed to be diagonal as the noise is expected to be
uncorrelated.
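For diagonal covariance matrices, the cost in (1) reduces to a sum of squared residuals, each divided by the variance of the corresponding measurement. The Python sketch below evaluates such a cost for a deliberately simple, hypothetical pair of simulation functions; the functions F_w and F_y, the data and the variances are invented for illustration and do not reproduce the labelling simulation used in influx_s.

```python
def weighted_sq_norm(residual, variances):
    """Squared norm weighted by a diagonal covariance: sum of r_i^2 / sigma_i^2."""
    return sum(r * r / var for r, var in zip(residual, variances))

def cost(theta, omega, F_w, F_y, w, y, var_w, var_y):
    """Sum of squared weighted errors on flux data (w) and labelling data (y)."""
    r_w = [sim - meas for sim, meas in zip(F_w(theta), w)]
    r_y = [sim - meas for sim, meas in zip(F_y(theta, omega), y)]
    return weighted_sq_norm(r_w, var_w) + weighted_sq_norm(r_y, var_y)

# Hypothetical simulators: one free flux theta[0] (a split ratio) and one
# scaling factor omega[0] applied to the simulated labelling signal.
def F_w(theta):
    return [theta[0]]

def F_y(theta, omega):
    return [omega[0] * theta[0], omega[0] * (1.0 - theta[0])]

w = [0.5]              # measured flux (invented)
y = [1.0, 3.0]         # measured labelling signals (invented)
var_w = [0.0625]       # variance of the flux measurement
var_y = [0.25, 0.25]   # variances of the labelling measurements

print(cost([0.25], [4.0], F_w, F_y, w, y, var_w, var_y))  # 1.0
print(cost([0.5], [4.0], F_w, F_y, w, y, var_w, var_y))   # 8.0
```

In this invented example the first parameter set fits the labelling data exactly but misses the measured flux by one standard deviation, whereas the second fits the flux and misses both labelling signals, so it is penalized more heavily.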
The solution of (1) must satisfy linear inequality constraints
$$U \begin{pmatrix} \Theta \\ \omega \end{pmatrix} \geq c \qquad (2)$$
where U is an inequality matrix which is multiplied by the compound
vector of free parameters Θ and ω, and c is a right-hand side vector.
Inequalities express s (...truncated)