Reconstructing Generalized Logical Networks of Transcriptional Regulation in Mouse Brain from Temporal Gene Expression Data
Hindawi Publishing Corporation
EURASIP Journal on Bioinformatics and Systems Biology
Volume 2009, Article ID 545176, 13 pages
doi:10.1155/2009/545176
Research Article
Mingzhou (Joe) Song,1 Chris K. Lewis,1 Eric R. Lance,1 Elissa J. Chesler,2
Roumyana Kirova Yordanova,3 Michael A. Langston,4 Kerrie H. Lodowski,5
and Susan E. Bergeson6
1 Department of Computer Science, New Mexico State University, Las Cruces, NM 88003, USA
2 Systems Genetics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
3 Department of Applied Genomics, Bristol-Myers Squibb R&D, P.O. Box 5400, Princeton, NJ 08543, USA
4 Department of Computer Science, University of Tennessee, Knoxville, TN 37996, USA
5 Department of Pharmacology, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
6 Department of Pharmacology and Neuroscience, Texas Tech University, Lubbock, TX 79430, USA
Correspondence should be addressed to Mingzhou (Joe) Song,
Received 1 June 2008; Revised 8 September 2008; Accepted 12 December 2008
Recommended by Dirk Repsilber
Gene expression time course data can be used not only to detect differentially expressed genes but also to find temporal associations
among genes. The problem of reconstructing generalized logical networks to account for temporal dependencies among genes
and environmental stimuli from transcriptomic data is addressed. A network reconstruction algorithm was developed that uses
statistical significance as a criterion for network selection to avoid false-positive interactions arising from pure chance. The
multinomial hypothesis testing-based network reconstruction allows explicit specification of the false-positive rate, a feature that
distinguishes it from other extant network inference algorithms. The method outperformed dynamic Bayesian network modeling in a simulation study.
Temporal gene expression data from the brains of alcohol-treated mice were modeled to analyze the molecular response to alcohol.
Genes from major neuronal pathways are identified as putative components of the alcohol response mechanism.
Nine of these genes have associations with alcohol reported in literature. Several other potentially relevant genes, compatible with
independent results from literature mining, may play a role in the response to alcohol. Additional, previously unknown gene
interactions were discovered that, subject to biological verification, may offer new clues in the search for the elusive molecular
mechanisms of alcoholism.
Copyright © 2009 Mingzhou (Joe) Song et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
1. Introduction
The regulation of transcription occurring in an intriguingly
complex biological system involves multiple interacting
regulatory processes in gene regulatory networks (GRNs).
Modeling transcriptional regulation requires algorithms
that retain information about regulatory interactions. The
generalized logical network (GLN) is a generative model
that can be reconstructed from temporal trajectories, for
example, from data collected in time-series studies of gene
expression. Because these data capture information on
temporal antecedence, the approach can be used to develop
stronger hypotheses about causal relations among transcriptional
events than one would be able to derive from mere
correlation analyses. We designed a GLN reconstruction
algorithm that differs from previous approaches because
it makes use of hypothesis testing on the multinomial
distribution to establish directed connections among genes.
Our statistical approach allows explicit control of false
positives by specifying a desirable alpha level, while other
criteria used in network reconstruction, such as the Bayesian
information criterion (BIC) used in dynamic Bayesian
networks (DBNs) reconstruction and the coefficient of
determination (COD) used in Boolean networks (BNs)
reconstruction, do not explicitly enforce false-positive rate
control.
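The idea of accepting a directed connection only when it is statistically significant at a chosen alpha level can be sketched as follows. This is a minimal illustration, not the reconstruction algorithm itself: it assumes a single candidate parent, a time delay of one step, and a Pearson chi-square test of independence on the contingency table of parent value at time t versus child value at time t+1. The function names chi2_sf and transition_test, and the default alpha of 0.05, are hypothetical.

```python
import math
from collections import Counter

def chi2_sf(x, dof):
    """Upper-tail probability of a chi-square variable with integer dof >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if dof % 2 == 0:
        # Closed form for even dof: exp(-h) * sum_{k < dof/2} h^k / k!
        term = math.exp(-h)
        total = term
        for k in range(1, dof // 2):
            term *= h / k
            total += term
        return total
    # Odd dof: start from erfc and apply Q(a+1, h) = Q(a, h) + h^a e^-h / Gamma(a+1)
    total = math.erfc(math.sqrt(h))
    term = math.exp(-h) * math.sqrt(h) / math.gamma(1.5)
    for j in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= h / (j + 0.5)
    return total

def transition_test(parent, child, alpha=0.05):
    """Test whether child[t+1] is associated with parent[t] (assumed delay of 1).

    Returns (chi-square statistic, p-value, significant at alpha?).
    """
    pairs = list(zip(parent[:-1], child[1:]))
    n = len(pairs)
    joint = Counter(pairs)
    row = Counter(p for p, _ in pairs)   # marginal counts of parent values
    col = Counter(c for _, c in pairs)   # marginal counts of child values
    stat = 0.0
    for p in row:
        for c in col:
            expected = row[p] * col[c] / n
            stat += (joint[(p, c)] - expected) ** 2 / expected
    dof = (len(row) - 1) * (len(col) - 1)
    if dof == 0:
        # A constant variable carries no evidence of an interaction.
        return stat, 1.0, False
    p_value = chi2_sf(stat, dof)
    return stat, p_value, p_value <= alpha
```

Under these assumptions, a candidate edge from parent to child would be retained only when the third returned value is true, so the alpha argument directly bounds the per-test false-positive rate.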
GLNs also allow more aspects of systems to be studied
than other network models by enabling (1) adaptive description for interactions among variables, (2) nonlinear interaction patterns, and (3) finite steady states, attractor basins,
and state transition diagrams. The software CellNetAnalyzer
[1] allows a user to draft a GLN from existing knowledge.
Our method allows such networks to be reconstructed and
derived solely from data-driven approaches. GLNs have
the further advantage that they do not require parametric
assumptions, unlike stochastic logical networks [2] which
discretize differential equations based on strong assumptions. Additionally, our implementation of GLN modeling
focuses on network reconstruction from temporal gene
expression data, which can be used complementarily with
network property analysis algorithms such as the network
walking algorithm [3], and literature mining tools such as
those reviewed in [4].
A GLN is a dynamical system model that characterizes
interactions among discrete variables over discrete time. It
is a directed graph, with nodes representing the discrete
variables and each having a generalized truth table (gtt). The
gtt for a node X maps all possible combinations of parent
node values to values of X. Related modeling paradigms with
different emphases have also been applied to biological data
and are compared and contrasted with the GLN below.
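To make the definition concrete, the following Python sketch encodes a small GLN as parent lists plus generalized truth tables and advances it by one time step. The three nodes, their numbers of discrete levels, the table entries, and the helper names (is_complete, step, trajectory) are hypothetical and chosen only for illustration. The sketch also assumes a synchronous update with a time delay of one step.

```python
from itertools import product

# Hypothetical three-node network; all names and tables are illustrative.
levels = {"A": 2, "B": 3, "C": 2}                  # discrete levels per node
parents = {"A": ["C"], "B": ["A", "C"], "C": ["B"]}

# Generalized truth tables: tuple of parent values -> next value of the node.
gtt = {
    "A": {(0,): 1, (1,): 0},
    "B": {(a, c): (a + 2 * c) % 3 for a, c in product(range(2), range(2))},
    "C": {(0,): 0, (1,): 1, (2,): 1},
}

def is_complete(node):
    """Check that the gtt covers every parent combination with a valid value."""
    combos = product(*(range(levels[p]) for p in parents[node]))
    return all(
        c in gtt[node] and 0 <= gtt[node][c] < levels[node] for c in combos
    )

def step(state):
    """Synchronously update every node from its parents' current values."""
    return {n: gtt[n][tuple(state[p] for p in parents[n])] for n in gtt}

def trajectory(state, n_steps):
    """Return the list of states visited, starting from the initial state."""
    path = [dict(state)]
    for _ in range(n_steps):
        state = step(state)
        path.append(state)
    return path
```

Because each node is ternary or binary rather than strictly Boolean, the table for B illustrates how a gtt generalizes a Boolean truth table to variables with more than two levels.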
(i) Temporal probabilistic networks. The dynamic
Bayesian network (DBN) is an extension of Bayesian networks, which incorporates time transitions between Bayesian
networks. A DBN describes temporal statistical dependencies
among genes. DBNs have been successful in extracting
probabilistic dependencies among genes in GRNs [5–7].
Certain DBNs can even be converted to probabilistic Boolean
networks [8]. However, a DBN is an indirect tool for understanding system dynamics since it does not explicitly describe
temporal relations among entities in a functional form, while
a GLN provides immediate functional relationships among
variables.
(ii) Continuous dynamical system models. Differential
equations in both deterministic [9, 10] and stochastic [11]
formulations have been used to model interactions in GRNs
in continuous time. The E-Cell Project [12, 13] uses differential equations to target knowledge-based reproduction, not
data-driven reconstruction, of intracellular biochemical and
molecular interactions within a single cell. The stochastic
master equations relate state probabilities by differential
equations, an approach that is impractical for biological systems involving many
variables because of the computational burden. Recent
research has focused on improving the scalability of
such models [14].
(iii) Discrete dynamical system models. The Boolean
network (BN) [ (...truncated)