A comprehensive computational model of facilitated diffusion in prokaryotes
Nicolae Radu Zabet (0, 1) and Boris Adryan (0, 1)
Associate Editor: John Quackenbush
(0) Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
(1) Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR
Motivation: Gene activity is mediated by site-specific transcription factors (TFs). Their binding to defined regions in the genome determines the rate at which their target genes are transcribed.
Results: We present a comprehensive computational model of the search process of TFs for their genomic target site(s). The computational model considers the DNA sequence, various TF species and the interactions of the individual molecules with the DNA and with each other. We also demonstrate a systematic approach to parametrizing the system using available experimental data.
Contact:
Supplementary information: Supplementary data are available at Bioinformatics online.
The Author(s) 2012. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
1 INTRODUCTION
Originally, it was believed that transcription factors (TFs) find their
target sites only through 3D diffusion and that the association rate
would follow the Smoluchowski limit. Riggs et al. were the first to observe
that the rate at which the lac repressor locates its target site is
much faster than the rate predicted by the Smoluchowski limit and
hypothesized that a different mechanism was involved in this process
(Riggs et al., 1970).
In their seminal work, von Hippel et al. (Berg et al., 1981;
Winter et al., 1981) thoroughly investigated this process from
both a theoretical and experimental perspective and concluded that
TF molecules use the facilitated diffusion mechanism to locate
their target sites. The facilitated diffusion mechanism combines
3D diffusion in the cytoplasm with a 1D random walk on the DNA.
This reduces the dimensionality of the search process and,
consequently, speeds up the search. In
addition, three main types of movements on the DNA were proposed:
(i) sliding, (ii) hopping and (iii) jumping (Berg et al., 1981). Sliding
and hopping are both mechanisms of 1D random walk, but the
difference between them is that during hopping the molecules lose
contact with the DNA, whereas during sliding the molecules keep
contact with the DNA. Jumping, in contrast, assumes that the molecules
do not merely lose contact with the DNA for a short time interval (as
in the case of hopping), but are released completely into the
cytoplasm, where they spend a longer time before they rebind to the
DNA at a position that is uncorrelated with the position where they
unbound.
The existence of the 1D random walk in vivo was recently
confirmed by Elf et al. (2007). The authors of that study used
fluorescent lac repressor tetramers and visualized their movement
in a live Escherichia coli cell, confirming that the molecules spend
90% of the time bound to the DNA.
There are still missing pieces in our understanding of the
facilitated diffusion mechanism. One approach to filling these
gaps consists of building a computational tool able to simulate
the relevant molecules in a cell and the entire DNA sequence.
This type of approach can address several questions, e.g. how
crowding influences the search process at the genome-wide level
when the crowding molecules are treated in a dynamical context
(Chu et al., 2009) rather than as static barriers (Li et al., 2009).
In addition, one could investigate systems with real affinity
landscapes, which is not possible with analytical tools
(Berg et al., 1981).
In this article, we present a computational model for stochastic
simulation of the search process of TFs for their target sites on the
DNA. The model considers each TF molecule as an independent
object, which can move freely in the bacterial cytoplasm, but which
can also bind to the DNA and perform a 1D random walk. The
DNA molecule is modelled as a string of nucleotides, which determines
the specific affinity between a TF molecule and the DNA at the position
where the molecule is bound. We also go through the literature and
systematically infer each microscopic parameter of the model from
macroscopic experimental measurements.
Finally, we developed an implementation of the proposed model,
which is available in Zabet and Adryan (2012).
One strategy to stochastically model the search of TFs for their
target sites consists of designing a hybrid system combining
agent-based modelling and stochastic simulation techniques (Gillespie,
1977). In this model, each TF molecule is represented as an agent
able to perform certain actions and the DNA molecule as a string
of the nucleotides a, t, c and g. The model can assume reflecting
boundaries (TFs that reach the boundary can only go back), periodic
boundaries (the DNA is assumed to be in a closed loop) or
absorbing boundaries (TFs that reach the boundary will unbind from
the DNA).
In this setting, the TF molecules can be either free in the cytoplasm
or bound on the DNA at a certain position. A free TF molecule has
only one action available, namely, to bind to the DNA.
Binding event
We assume that the bacterial cytoplasm is a perfectly mixed reservoir
from where the free TF molecules bind to the DNA. The 3D diffusion
of TF molecules in the cytoplasm is not modelled explicitly, but
rather, the molecules that are free in the cytoplasm have a certain
association rate to the DNA. To simulate the 3D diffusion implicitly,
we use the Direct Method implementation of the Gillespie algorithm
(Gillespie, 1977), which generates a statistically correct trajectory
of the Master Equation.
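A single step of the Direct Method can be sketched as follows. This is a generic textbook formulation in Python, not the authors' code; the function name gillespie_direct_step and the example propensities are hypothetical. Given the propensities of all possible events, the step draws an exponentially distributed waiting time and selects one event with probability proportional to its propensity.

```python
import math
import random


def gillespie_direct_step(propensities, rng=random):
    """One step of Gillespie's Direct Method.

    propensities -- non-negative event rates a_j
    Returns (tau, j): the waiting time and the index of the chosen event,
    or (math.inf, None) if no event can fire.
    """
    total = sum(propensities)
    if total <= 0.0:
        return math.inf, None
    # Waiting time is exponential with rate `total`; 1 - U lies in (0, 1].
    tau = -math.log(1.0 - rng.random()) / total
    # Pick event j with probability a_j / total.
    threshold = rng.random() * total
    cumulative = 0.0
    chosen = None
    for j, a in enumerate(propensities):
        if a <= 0.0:
            continue                    # impossible events are never chosen
        cumulative += a
        chosen = j
        if threshold < cumulative:
            break
    # If round-off prevented the break, `chosen` is the last possible event.
    return tau, chosen


random.seed(1)
tau, event = gillespie_direct_step([0.5, 1.5, 0.0])
print(tau, event)
```

In a simulation loop, the propensities would be recomputed after each event, the clock advanced by tau and the chosen event executed.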
The rate at which a TF molecule of species x will bind to the DNA
is computed as

$$k^{assoc}_x \cdot TF^{free}_x \cdot \frac{A^{current}_x}{A^{max}_x}$$

where $k^{assoc}_x$ is the reaction probability rate constant for species x,
$TF^{free}_x$ the number of free TF molecules of species x and the last
fraction ($A^{current}_x / A^{max}_x$) is the proportion of free positions where a
molecule can bind. A comprehensive list of all parameters used in
this article can be found in the Supplementary Material.
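The binding rate described above is the product of the association rate constant, the number of free molecules and the fraction of free positions. A minimal Python sketch follows; the function and argument names are hypothetical, and the numerical values are illustrative only and are not taken from the article.

```python
def binding_rate(k_assoc, n_free, a_current, a_max):
    """Rate at which free TF molecules of one species bind to the DNA.

    k_assoc   -- association rate constant of the species
    n_free    -- number of free TF molecules of the species
    a_current -- number of positions currently free for binding
    a_max     -- maximum number of positions where a molecule can bind
    """
    if a_max <= 0:
        raise ValueError("a_max must be positive")
    return k_assoc * n_free * (a_current / a_max)


# Illustrative values: 10 free molecules, half of the positions free.
print(binding_rate(k_assoc=2.0, n_free=10, a_current=50, a_max=100))
```

When no position is free the rate is zero, so no binding event can be selected for that species.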
Note that after each 1D move, the number of positions on the DNA
available for a TF to bind can change and, consequently, the
association rate needs to be updated often. An approximate system
would still account for the effect of occupancy on the binding of
TF molecules, but would update the rate only when a molecule
binds or unbinds and not when any other event (sliding or hopping)
changes the number of available binding sites on the
DNA. In the Supplementary Material, we show that the difference
between this approximation and the exact system is negligible and,
thus, one can use this approximate system to increase simulation
speed.
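The difference between the exact and the approximate update policy can be expressed as a simple rule that decides, after each event, whether the association rate is recomputed. The Python sketch below uses hypothetical event names and a hypothetical function name; it illustrates the policy only and does not reproduce the authors' implementation.

```python
OCCUPANCY_EVENTS = {"bind", "unbind"}   # change the number of bound molecules
MOVE_EVENTS = {"slide", "hop"}          # move an already bound molecule


def should_refresh_binding_rate(event, approximate):
    """Decide whether the association rate is recomputed after `event`.

    In the exact system the rate is refreshed after every event; in the
    approximate system it is refreshed only after binding or unbinding.
    """
    if event in OCCUPANCY_EVENTS:
        return True
    if event in MOVE_EVENTS:
        return not approximate
    raise ValueError(f"unknown event: {event!r}")


for event in ("bind", "slide", "hop", "unbind"):
    print(event, should_refresh_binding_rate(event, approximate=True))
```

Since 1D moves are typically far more frequent than binding and unbinding events, skipping the refresh after moves is where the speed-up would come from.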
When a molecule binds to the DNA (...truncated)