Parameter inference for stochastic single-cell dynamics from lineage tree data
Kuzmanovska et al. BMC Systems Biology (2017) 11:52
DOI 10.1186/s12918-017-0425-1
METHODOLOGY ARTICLE
Open Access
Irena Kuzmanovska1, Andreas Milias-Argeitis1,2, Jan Mikelson1, Christoph Zechner1,3
and Mustafa Khammash1*
Abstract
Background: With the advance of experimental techniques such as time-lapse fluorescence microscopy, the
availability of single-cell trajectory data has vastly increased, and so has the demand for computational methods
suitable for parameter inference with this type of data. Most currently available methods treat single-cell trajectories
independently, ignoring the mother-daughter relationships and the information provided by the population
structure. However, this information is essential if a process of interest happens at cell division, or if it evolves slowly
compared to the duration of the cell cycle.
Results: In this work, we propose a Bayesian framework for parameter inference on single-cell time-lapse data from
lineage trees. Our method relies on a combination of Sequential Monte Carlo for approximating the parameter
likelihood function and Markov Chain Monte Carlo for parameter exploration. We demonstrate our inference
framework on two simple examples in which the lineage tree information is crucial: one in which the cell phenotype
can only switch at cell division and another where the cell state fluctuates slowly over timescales that extend well
beyond the cell-cycle duration.
Conclusion: There exist several examples of biological processes, such as stem cell fate decisions or epigenetically
controlled phase variation in bacteria, where the cell ancestry is expected to contain important information about the
underlying system dynamics. Parameter inference methods that discard this information are expected to perform
poorly for such processes. Our method provides a simple and computationally efficient way to take into
account single-cell lineage tree data for the purpose of parameter inference and serves as a starting point for the
development of more sophisticated and powerful approaches in the future.
Keywords: Parameter inference, Cell lineages, Single cell, Stochastic systems, Monte Carlo methods
Background
Biochemical processes in isogenic cells exhibit substantial heterogeneity [1, 2]. Understanding the latter demands
experimental techniques that can resolve such processes
at the single-cell level. In contrast to bulk measurements,
these techniques provide not only access to the average
behavior of intracellular dynamics, but also its variability across cells and over time. Most single-cell techniques,
however, reveal only very few components simultaneously
that are often multiple steps away from the actual quantities of interest. The dynamics of a promoter, for instance,
may not be accessible directly, but only indirectly through
a fluorescent reporter that is expressed upon activation
of this promoter [3]. Statistical inference in combination
with mathematical models provides a means to reconstruct
inaccessible parameters from available measurements,
making them instrumental for studying biochemical processes based on single-cell data.
*Correspondence:
1 Department of Biosystems Science and Engineering, ETH Zurich,
Mattenstrasse 26, 4058 Basel, Switzerland
Full list of author information is available at the end of the article
How such inference can be performed depends strongly
on the way the data has been collected: flow cytometry measurements, for instance, reveal fluorescence values
across a population but individual cells cannot be tracked
over time. Consequently, measurements at two different
time instances are considered statistically independent.
Time-lapse microscopy techniques permit tracking of
single-cell trajectories over the duration of a whole experiment [4], which in turn provides a handle also on the
temporal correlation of the underlying process. This additional degree of information can dramatically improve the
inference of unknown process parameters [5].
© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Most existing inference approaches consider single-cell
trajectories to be statistically independent of each other
[3, 5–7]. This way, however, important information stemming from the ancestry of a cell is lost: shortly after
cell division, for example, two daughter cells are likely to
exhibit substantial correlations, which cannot be captured
by a model that assumes independence among cells. This
can yield incomplete and biased results, especially when
the time scale of the process under study is slow compared
with the cell cycle duration.
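The effect of the time scale can be made concrete with a small simulation. The sketch below is not taken from the article; it is a minimal illustration under assumed dynamics. It assumes that a scalar cell state follows a stationary Ornstein-Uhlenbeck process with relaxation rate theta and noise intensity sigma, that both daughters inherit the mother's state exactly at division, and that they evolve independently afterwards. Under these assumptions the correlation between sister cells at time t after division is exp(-2 * theta * t), so a slow process (small theta) keeps sisters correlated long after division. The function name sister_correlation and all parameter names are hypothetical.

```python
import math
import random


def sister_correlation(theta, sigma, t, n_pairs=20000, seed=0):
    """Estimate the correlation between two sister cells at time t after division.

    Assumption (illustrative only): the cell state is a stationary
    Ornstein-Uhlenbeck process, and both daughters start from the
    mother's state at division and then evolve independently.
    """
    rng = random.Random(seed)
    stat_sd = sigma / math.sqrt(2.0 * theta)      # stationary standard deviation
    decay = math.exp(-theta * t)                  # memory of the inherited state
    noise_sd = stat_sd * math.sqrt(1.0 - decay * decay)

    first, second = [], []
    for _ in range(n_pairs):
        mother = rng.gauss(0.0, stat_sd)          # state at division
        # Exact OU transition over time t, with independent noise per daughter.
        first.append(mother * decay + rng.gauss(0.0, noise_sd))
        second.append(mother * decay + rng.gauss(0.0, noise_sd))

    # Sample Pearson correlation, computed by hand to stay within the stdlib.
    mean_a = sum(first) / n_pairs
    mean_b = sum(second) / n_pairs
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    t = 1.0  # one cell-cycle duration, in arbitrary time units
    for theta in (0.1, 1.0, 5.0):
        estimate = sister_correlation(theta, sigma=1.0, t=t)
        print(f"theta={theta}: simulated={estimate:.3f}, "
              f"theory={math.exp(-2.0 * theta * t):.3f}")
```

A model that treats the two daughters as independent implicitly sets this correlation to zero, which is a reasonable approximation only when theta * t is large, that is, when the process relaxes quickly relative to the cell cycle.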
In addition, stochastic processes of interest such as epigenetically regulated phase variation in bacteria are often
driven by DNA replication just before cell division. Examples in this category are the regulation of the agn43 [8, 9]
and Pap [10, 11] systems in E. coli, and of the glucosyltransferase (gtr) gene cluster in Salmonella [12]. Due to
the non-reversibility of the epigenetic modifications, gene
replication (and consequently cell division) is crucial for
phase variation to happen. Cell lineage information
therefore has to be taken into account in single-cell studies of
these systems.
Until recently, there existed little work on statistical
inference using tree-based single-cell data. In [13], the
authors proposed a method for parameter inference from
single-cell trajectories based on Approximate Bayesian
Computation (ABC). Their approach is applicable to tree-structured data as well, although it requires all trajectories
to have the same length and sampling resolution. In [14]
the authors proposed an observer-based method for state
and parameter estimation in stochastic chemical reaction networks, which is also able to handle lineage tree
data. However, its applicability is limited to small systems
since it requires the full probability distributions from the
solution of the chemical master equation. Another alternative was proposed in [15], which presented an inference algorithm for Hidden Markov Trees using variational
Bayesian Expectation Maximization. This class of models
is similar to the one considered here, but cannot incorporate dynamic readouts or dynamically evolving single-cell
states.
In more recent work, the authors of [16] presented a
method for inferring tra (...truncated)