Exploiting combinatorial cultivation conditions to infer transcriptional regulation
Theo A Knijnenburg (2), Johannes H de Winde (1), Jean-Marc Daran (1), Pascale Daran-Lapujade (1), Jack T Pronk (1), Marcel JT Reinders (2), Lodewyk FA Wessels (0, 2)

0 Department of Molecular Biology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
1 Industrial Microbiology, Department of Biotechnology, Delft University of Technology, Julianalaan 67, 2628 BC Delft, The Netherlands
2 Information and Communication Theory Group, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands

Background: Regulatory networks often employ the model that attributes changes in gene expression levels, as observed across different cellular conditions, to changes in the activity of transcription factors (TFs). Although the actual conditions that trigger a change in TF activity should form an integral part of the generated regulatory network, they are usually lacking. This is because the large heterogeneity in the employed conditions and the continuous changes in environmental parameters in the often-used shake-flask cultures prevent unambiguous modeling of the cultivation conditions within the computational framework.
Results: We designed an experimental setup that allows us to explicitly model the cultivation conditions and use these to infer the activity of TFs. The yeast Saccharomyces cerevisiae was cultivated under four different nutrient limitations in both aerobic and anaerobic chemostat cultures. In the chemostats, environmental and growth parameters are accurately controlled. Consequently, the measured transcriptional response can be directly correlated with changes in the limited nutrient or oxygen concentration. We devised a tailor-made computational approach that exploits the systematic setup of the cultivation conditions in order to identify the individual and combined effects of nutrient limitations and oxygen availability on expression behavior and TF activity.
Conclusion: Incorporating the actual growth conditions when inferring regulatory relationships provides detailed insight into the functionality of the TFs that are triggered by changes in the employed cultivation conditions. For example, our results confirm the established role of TF Hap4 in both aerobic regulation and glucose derepression. Among the numerous inferred condition-specific regulatory associations between gene sets and TFs, many novel putative regulatory mechanisms, such as the possible role of Tye7 in sulfur metabolism, were also identified.
Background
A simple and often-used biological model to unravel
transcriptional regulation ascribes the change in gene
expression levels, as observed between different cellular
conditions, to changes in the activity of transcription
factors (TFs). Change of the transcriptional activity of a TF is
one of the means by which an organism adapts to changes
in the extracellular environment. A substantial amount of
research has employed this model to infer regulatory
networks by integrating gene expression data, sequence data
(to detect the cis-regulatory binding sites of TFs), e.g. [1-3],
and/or TF binding data, e.g. [4-6]. For an overview see
[7-9]. In most cases, the generated regulatory networks are
derived from large microarray compendia.
Notwithstanding the many advantages of such approaches, two main
drawbacks can be identified. Firstly, these compendia
gather very heterogeneous gene expression data derived
from various culture conditions (media, pH, temperature,
etc.) that, in a large majority of the cases, solely compare
the culture conditions to their direct condition-specific
references. Different cultivation conditions within the
compendium can, therefore, hardly be compared.
Secondly, the interpretation of transcriptome data obtained
from the generally employed shake-flask cultivations is
likely to be complicated by differences in specific growth
rate, carbon catabolite repression, nitrogen catabolite
repression, and more generally continuous changes in
environmental conditions. This prevents the
establishment of a direct link between the activity of TFs and
specific growth conditions.
A frequently employed approach links a TF to a module,
i.e. a set of co-expressed genes, based on TF binding data
or promoter analysis. Enrichment of functional categories
(such as GO [10] and MIPS [11]) within the module
provides clues about the function of the TFs associated with
the module. Although this can provide a global view of
the transcriptional role of a TF, we are convinced that the
precise conditions or perturbations that trigger a change
in the activity of TFs should be an integral part of the
generated regulatory network.
To this end, we designed an experimental setup that
allowed us to explicitly model the cultivation conditions
and use these to infer the activity of TFs. To achieve this,
we employed chemostat cultures that enable the
cultivation of micro-organisms under tightly defined
environmental conditions. Chemostat cultures are superior to the
shake-flask cultures in both accuracy and reproducibility
[12]. In a chemostat, culture broth (including biomass) is
continuously replaced by fresh medium at a fixed and
accurately determined dilution rate. When the dilution
rate is lower than μmax, the maximal specific growth rate of
the micro-organism, a steady-state situation will be
established in which the specific growth rate equals the
dilution rate. In such a steady-state chemostat culture, μ is
controlled by the (low) residual concentration of a single
growth-limiting nutrient. In this research, microarrays
were employed to measure the genome-wide
transcriptional response of the yeast Saccharomyces cerevisiae to
growth limitation by four different macronutrients
(carbon, nitrogen, phosphorus, and sulfur) in both aerobic
and anaerobic chemostat cultures (Figure 1) [13]. Except
for the different nutrient limitations and oxygen
availability, all other culture parameters (such as growth rate, pH,
temperature, etc.) were kept constant throughout the
different experiments. Thus, changes in gene expression
levels can solely be attributed to the different nutrient
limitations and the oxygen regime. We devised a
computational approach that exploits the interrelatedness
between the conditions in order to identify the individual
and combined effects of nutrient limitations and oxygen
availability on expression behavior and TF activity. The
inclusion of the growth conditions in the analysis allows
for the identification of direct links between the
cultivation conditions, TFs triggered by specific cultivation
conditions and the targets of these TFs.
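The chemostat steady state described above, in which the specific growth rate settles at the dilution rate, can be illustrated with a minimal numerical sketch. The sketch assumes Monod kinetics for the dependence of growth rate on the residual concentration of the limiting nutrient; this kinetic model, the function names, and all parameter values are illustrative assumptions and are not taken from this study.

```python
# Minimal sketch of a chemostat steady state.
# ASSUMPTION: growth follows Monod kinetics, mu(S) = mu_max * S / (K_s + S).
# All names and numbers below are hypothetical and only serve as illustration.


def monod_growth_rate(s, mu_max, k_s):
    """Specific growth rate (1/h) at residual nutrient concentration s."""
    return mu_max * s / (k_s + s)


def steady_state_substrate(dilution_rate, mu_max, k_s):
    """Residual nutrient concentration at which mu(S) equals the dilution rate.

    The biomass balance dX/dt = (mu - D) * X is zero (for X > 0) only when
    mu = D; solving mu(S) = D under Monod kinetics gives
    S* = K_s * D / (mu_max - D).
    """
    if not 0 < dilution_rate < mu_max:
        raise ValueError("a steady state requires 0 < D < mu_max")
    return k_s * dilution_rate / (mu_max - dilution_rate)


MU_MAX = 0.4  # hypothetical maximal specific growth rate (1/h)
K_S = 0.05    # hypothetical half-saturation constant (g/l)
D = 0.1       # hypothetical dilution rate (1/h), below MU_MAX

s_star = steady_state_substrate(D, MU_MAX, K_S)
mu_star = monod_growth_rate(s_star, MU_MAX, K_S)

print(f"residual nutrient concentration at steady state: {s_star:.4f} g/l")
print(f"specific growth rate at steady state: {mu_star:.4f} 1/h (D = {D} 1/h)")
```

Under these assumptions the residual nutrient concentration at steady state lies below the half-saturation constant, which is consistent with the statement that growth is controlled by a low residual concentration of the single limiting nutrient.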
Results
Overview of the computational approach
From the continuous expression levels measured across
the cultivation conditions we derive a discretized
representation of the expression behavior for each gene. This
representation indicates up- or downregulation as a
consequence of the indi (...truncated)