ChiNet uncovers rewired transcription subnetworks in tolerant yeast for advanced biofuels conversion

Nucleic Acids Research, May 2015

Analysis of rewired upstream subnetworks impacting downstream differential gene expression aids the delineation of evolving molecular mechanisms. Cumulative statistics based on conventional differential correlation are limited for subnetwork rewiring analysis since rewiring is not necessarily equivalent to change in correlation coefficients. Here we present a computational method ChiNet to quantify subnetwork rewiring by statistical heterogeneity that enables detection of potential genotype changes causing altered transcription regulation in evolving organisms. Given a differentially expressed downstream gene set, ChiNet backtracks a rewired upstream subnetwork from a super-network including gene interactions known to occur under various molecular contexts. We benchmarked ChiNet for its high accuracy in distinguishing rewired artificial subnetworks, in silico yeast transcription-metabolic subnetworks, and rewired transcription subnetworks for Candida albicans versus Saccharomyces cerevisiae, against two differential-correlation based subnetwork rewiring approaches. Then, using transcriptome data from tolerant S. cerevisiae strain NRRL Y-50049 and a wild-type intolerant strain, ChiNet identified 44 metabolic pathways affected by rewired transcription subnetworks anchored to major adaptively activated transcription factor genes YAP1, RPN4, SFP1 and ROX1, in response to toxic chemical challenges involved in lignocellulose-to-biofuels conversion. These findings support the use of ChiNet in rewiring analysis of subnetworks where differential interaction patterns resulting from divergent nonlinear dynamics abound.


Yang Zhang (1), Z. Lewis Liu (0), Mingzhou Song (1)

(0) National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, Peoria, IL 61604, USA
(1) Department of Computer Science, New Mexico State University, Las Cruces, NM 88003, USA

*To whom correspondence should be addressed. Tel: +1 575 646 4299; Fax: +1 575 646 1002. Correspondence may also be addressed to Z.L. Liu. Tel: +1 309 681 6294; Fax: +1 309 681 6427. Present address: Yang Zhang, Amyris Inc., Emeryville, CA, USA.

Network rewiring refers to changes in a network over time through either gain or loss of molecular interactions among distinct taxonomic entities (1). Rewiring of subnetworks, specific components in molecular networks, allows an organism to adapt to a defined environmental condition. Subnetwork rewiring alters molecule-molecule interactions as a second-order change that occurs in either the strength or the topology of molecular interactions. There must exist some input to which a rewired subnetwork responds differently from the original subnetwork. We define the first-order change of a subnetwork as a working zone change, characterized by a shift in the probability distributions of molecules in the subnetwork. A modified subnetwork response can be a consequence of either first- or second-order subnetwork changes.

Modified metabolic network responses involving glycolysis and pentose phosphate pathways have been defined for a tolerant industrial yeast strain Saccharomyces cerevisiae NRRL Y-50049 under toxic chemical challenges (2). Mechanisms of in situ detoxification by the yeast strain and key regulatory elements involved in its tolerance were also identified (3,4). However, it remains unclear how upstream transcription networks may have been rewired to impact the downstream metabolism that confers toxic tolerance on Y-50049.
Large-scale rewiring of transcription programs in response to the loss of a cis-regulatory element was reported to elicit contrasting anaerobic/aerobic growth in yeasts (5). Variations in gene expression responses to stresses among four yeast species have been confirmed to be associated with the disparity of TATA boxes in promoter regions (6). Extracellular signaling was also shown to impose posttranslational modifications on a transcription factor (TF) that reverse its function from activation to repression of gene expression (7). In fact, TF binding was found to be ultra-sensitive to disruptive sequence variations among human genomes (8). These findings raise the possibility that transcription network rewiring may have caused the outstanding detoxification capability of microbial strains for advanced biofuels production (2,4).

In this study, we aim to discover rewired upstream molecular subnetworks that alter expression of target gene sets functioning in downstream biological processes at the genome scale. By subnetwork, we mean a context-dependent component of a larger network, such as an active transcription subnetwork. A gene set refers to genes involved in a specifically defined metabolic or signaling pathway. Using cumulative rewiring statistics over a subnetwork, our rationale is to detect subnetworks that are most rewired and also those with individually weak but collectively strong rewiring signals. We aim to uncover both linearly and non-linearly rewired patterns in subnetworks, which have been insufficiently addressed by existing approaches that focus mostly on linear interactions.

Pathway analysis aims at revealing activities of functional modules and is more robust to noise than gene-centric methods. Although dozens of existing pathway analysis methods (9,10) are all sensitive to pathway response changes, most are unable to distinguish whether the source of change is pathway stimuli (first-order) or subnetwork rewiring (second-order).
For example, evolved from early methods for over-representation of GO terms (11–13), gene set enrichment analyses (14–20) detect subnetwork activity by cumulative statistics of differential gene expression. Multivariate analysis of variance (MANOVA) was proposed to study differential expression of gene sets across conditions by implicitly accounting for gene-gene dependence but not rewiring (21). NetGSA (22) models gene-gene dependency explicitly for differential gene expression analysis of subnetworks, yet it assumes fixed gene-gene interaction coefficients that do not allow for rewiring analysis.

Moving beyond first-order gene-set analysis, current pathway analysis methods detect changes in second-order molecular interactions. Draghici et al. (23,24) developed impact analysis methods to compute perturbation factors to evaluate changes in differentially expressed genes in a pathway with respect to the topology of the pathway. SAMNet (25) solves a multi-commodity network flow problem formulated on the topology of protein and mRNA molecules involved in multiple conditions. Network rewiring is indicated by unequal flows passing through gene interactions associated with each condition. A recent method, EDDY (26), statistically evaluates the difference in dependencies among genes in a given set between two conditions based on divergence of posterior probabilities modeled by Bayesian networks. Although these subnetwork analysis methods use both first- and second-order information from subnetwork topologies, they are still not designed to annotate the cause of altered subnetwork responses as being subnetwork input or rewiring.

Encouragingly, we start to see methods that are specific to subnetwork rewiring and enable one to separate the driver versus passenger pathways.
An early method (27) identifies a responsive subnetwork by maximizing a score based on covariance between genes in protein interaction subnetworks; responsive subnetworks can be independently inferred for each condition and compared for rewiring. The gene set co-expression analysis (GSCA) method (28) sums the absolute differential correlation of all gene pairs to obtain a subnetwork dispersion index, but is non-specific to subnetwork topology and favors those differential interactions in the middle of a signaling cascade. Others use ranks of genes to compute pair-wise correlation (29). The method COSINE computes a score from both gene expression and gene interactions to find condition-specific subnetworks (30), where the gene interaction score is an expected F statistic derived from correlation coefficients. DINA (31) detects gene interactions using Spearman correlation coefficients and further utilizes an entropy measure to determine rewiring based on differences in the number of active interactions in each condition. Most recently, the gene set network correlation analysis (GSNCA) method (32) extended GSCA by assigning heavy weights to emphasize hub genes with strong correlations. However, all these subnetwork rewiring analysis methods follow the principle of subtraction of interaction statistics (33). Regardless of whether the statistics are linear (e.g. Pearson correlation) or non-linear (e.g. Spearman correlation), such a principle cannot discriminate many complex patterns that indeed differ, as our results will demonstrate.

Production of advanced biofuels including cellulosic ethanol poses major technical challenges for a sustainable bio-based economy (34,35). Development of the next-generation biocatalyst with robust and tolerant characteristics is a necessity for industrial applications. A robust prototype of tolerant industrial yeast S.
cerevisiae NRRL Y-50049 was developed through evolutionary engineering to in situ detoxify furfural and 5-hydroxymethylfurfural (HMF), representative toxic inhibitory compounds liberated from lignocellulose biomass pretreatment (3,36–39). Strain Y-50049 is able to recover from a short lag phase and complete fermentation in the presence of 20 mM each of furfural and HMF (2). Its parental strain, in contrast, was unable to establish a viable culture after a 48 h lag phase and eventually lost function. Although key elements and modified detoxification pathway responses of the tolerant yeast have been described (2,4,40), altered global gene networks and pathways at the genome level that regulate inhibitor tolerance remain largely unknown. How the key regulatory gene YAP1 is wired to impact specific downstream metabolic pathways for yeast tolerance is not clear.

Yeast tolerance is a collective phenotype of gene functions and gene interactions in the cellular regulatory system. Genetic variations, by causing upstream transcription subnetwork rewiring, consequently alter downstream metabolic pathway responses. Currently available bioinformatics methods are insufficient to dissect components of such adapted transcription networks. The lack of this knowledge hinders genetic engineering and development of the next-generation biocatalyst for advanced biofuels production.

Previously, we reported a comparative chi-square analysis (CPχ2) for single-interaction rewiring (41). In this work, we present a computational method ChiNet to detect rewired subnetworks of multiple gene interactions in transcription regulation of downstream metabolic pathways for the tolerant strain NRRL Y-50049. We address three technical challenges: (i) developing subnetwork rewiring statistics; (ii) approximating the null distribution of subnetwork heterogeneity; and (iii) incorporating known TFs and their target genes to enhance the estimation of changes in network connectivity.
We further demonstrate that ChiNet consistently outperforms differential-correlation based approaches in 60 realistic in silico yeast transcription-metabolic subnetworks. We also benchmarked ChiNet on simulated subnetworks with 459 configurations of characteristics, showing its substantial advantage over other rewiring analysis methods. We further validated ChiNet for its high accuracy in distinguishing known rewired transcription subnetworks for Candida albicans versus S. cerevisiae, in comparison to differential-correlation based subnetwork rewiring approaches. Finally, we report transcription subnetwork rewiring anchored to adaptively activated key TF genes that potentially impact global adaptation of the tolerant yeast under toxic chemical stresses. This first insight into rewiring of the gene regulatory network in the tolerant yeast aids dissection of genomic mechanisms of yeast tolerance and development of the next-generation biocatalyst for sustainable biofuels and a bio-based economy. ChiNet is most suitable for analyzing high-throughput omics data sets from organisms that are not well understood and for determining the deviation of their molecular subnetworks from those of related well-characterized organisms. It opens a new avenue to study the evolution of functional subnetworks in biological systems.

MATERIALS AND METHODS

The ChiNet method for analysis of subnetwork rewiring

We first give an overview of the ChiNet method, which decides whether a subnetwork is conserved, or rewired in either topology or interaction strength, across two conditions. In the comparative chi-square framework CPχ2 first introduced in (41), rewired interactions were characterized by a heterogeneity chi-square. CPχ2 can detect single strongly rewired interactions, but was not designed to detect subnetwork rewiring. In ChiNet, reported here, we introduce cumulative subnetwork statistics for subnetwork rewiring and gamma distributions to assess their null distributions. The strategy is illustrated in Figure 1.
The input to ChiNet is an inclusive subnetwork topology and observed data for nodes in the subnetwork under two experimental conditions. Subnetwork topology is a required input, either taken from prior knowledge or extracted from the data by other means such as the backtracking procedure presented later. The subnetwork topology can be extracted from a reference biological network module of either the species in question or its related species from KEGG Pathway (42) or other such databases. ChiNet adapts the topology to select only active interactions associated with the current experimental conditions. The output of ChiNet consists of subnetwork homogeneity, heterogeneity and total activity across all experimental conditions. In our investigation of yeast tolerance to toxic chemical compounds, heterogeneity $D^2$ measures the subnetwork difference between the two yeast strains in response to the chemical stresses, and homogeneity $C^2$ measures the strength of subnetwork similarity. The total subnetwork activity $T^2$ represents overall activity for all conditions. The three subnetwork statistics satisfy a decomposition rule central to ChiNet:

$$T^2 = C^2 + D^2 \tag{1}$$

which implies that knowing any two statistics determines the third. We evaluate the significance of the three statistics by gamma distributions to account for statistical dependencies among interactions in subnetworks. Supplementary Figure S1 illustrates that the gamma distribution approximates the null distribution of subnetwork heterogeneity $D^2$ better than the chi-square approximation when heterogeneity chi-squares of individual nodes in a subnetwork are not all independent. If a subnetwork contains differentially expressed genes but is not detected as heterogeneous, its activity is most likely a consequence of differential signaling input rather than genetic variation within the subnetwork. ChiNet then still considers such a subnetwork conserved despite its differentially expressed genes.
When a subnetwork topology is dependent on the molecular context and not specified in advance, we extract one from a super network that contains all possible interactions by the BACKTRACK-REWIRED-SUBNETWORKS algorithm.

Next we describe technical details of ChiNet. By assessing subnetwork homogeneity and heterogeneity, ChiNet determines whether subnetworks of the same set of nodes are conserved, or rewired in either interaction strength or topology, across two or more conditions. We assume subnetwork G is given with its node set V and edge set E. Often a given subnetwork topology is a superset of all known interactions, which can be either active or inactive in the current experiment. To use only active interactions, ChiNet has an option to identify a subnetwork from G that best fits the current experimental data using a chi-square method (43). This step can be replaced by other network inference methods whose output network topology serves as input to ChiNet.

In an interaction, we call the cause variables the parents and the effect variable the child. Let $\pi_1$ and $\pi_2$ be the parent sets of child node $i$ under two conditions. We form a pooled $r(i) \times s(i)$ contingency table: the $r(i)$ rows are values of parents in the union $\pi_1 \cup \pi_2$, and the $s(i)$ columns are the values of child node $i$. Let $[n_j[l, m]]$ ($j = 1, 2$) be $r(i) \times s(i)$ contingency tables using the same parent union for child $i$ under each of the two conditions. We measure interaction homogeneity of child node $i$ by Pearson's chi-square

$$c^2(i) = \sum_{l=1}^{r(i)} \sum_{m=1}^{s(i)} \frac{\left(n_1[l, m] + n_2[l, m] - \bar{n}_c[l, m]\right)^2}{\bar{n}_c[l, m]} \quad \text{(Interaction homogeneity)} \tag{2}$$

with $v_c(i) = (r(i) - 1)(s(i) - 1)$ degrees of freedom (d.f.). Here $\bar{n}_c[l, m]$, the expected count in cell $[l, m]$ under the null hypothesis of parents and child being independent, is

$$\bar{n}_c[l, m] = \frac{1}{n_1 + n_2} \sum_{t=1}^{s(i)} \left(n_1[l, t] + n_2[l, t]\right) \sum_{u=1}^{r(i)} \left(n_1[u, m] + n_2[u, m]\right) \tag{3}$$

where $n_1$ and $n_2$ are the sample sizes for each condition, respectively. Next we assess interaction strength under each condition.
Under the null hypothesis that parents and child are independent, the two tables come from the same null distribution of the pooled table. This gives rise to the expected count for each condition under the null hypothesis

$$\bar{n}_j[l, m] = \frac{n_j}{(n_1 + n_2)^2} \sum_{t=1}^{s(i)} \left(n_1[l, t] + n_2[l, t]\right) \sum_{u=1}^{r(i)} \left(n_1[u, m] + n_2[u, m]\right), \quad j = 1, 2 \tag{4}$$

A chi-square is computed using the observed and expected counts for each condition by

$$\chi_j^2(i) = \sum_{l=1}^{r(i)} \sum_{m=1}^{s(i)} \frac{\left(n_j[l, m] - \bar{n}_j[l, m]\right)^2}{\bar{n}_j[l, m]}, \quad j = 1, 2 \tag{5}$$

with $v_1(i) = v_2(i) = (r(i) - 1)(s(i) - 1)$ d.f., respectively. Summing up interaction chi-squares over both conditions, we define the interaction total activity of node $i$ by

$$t^2(i) = \chi_1^2(i) + \chi_2^2(i) \quad \text{(Interaction total activity)} \tag{6}$$

with $v_t(i) = v_1(i) + v_2(i)$ d.f. Our previous CPχ2 work on single-interaction comparative chi-square analysis (41) established interaction heterogeneity $d^2(i)$ for each node in a network as follows:

$$d^2(i) = t^2(i) - c^2(i) \quad \text{(Interaction heterogeneity)} \tag{7}$$

which measures how far the interactions deviate from the pooled version. We proved that $d^2(i)$ asymptotically follows a chi-square distribution with $v_d(i) = v_t(i) - v_c(i)$ d.f. when no interactions exist across the conditions (41).

Here, parent/cause and child/effect variables are provided on the input subnetwork topology to ChiNet analysis. When a parent-child relationship is one-to-one with no time delay, the interaction heterogeneity chi-square statistic $d^2(i)$ is symmetrical with respect to the parent and the child. Thus, $d^2(i)$ reflects changes in the parent-child relationship but does not necessarily make a statement on causality. When a many-to-one or a temporally delayed parent-child relationship is considered, $d^2(i)$ is asymmetric and may reflect causal changes in such a relationship.

Now, we extend the three chi-square statistics from the individual interaction level to the subnetwork level. Here, the null hypothesis is that no interactions in the subnetwork are active under any condition. We first assume interactions in a subnetwork are statistically independent with respect to each node. We define, across all conditions, the subnetwork total activity chi-square and associated d.f. by

$$T^2 = \sum_{i \in V} t^2(i), \qquad v_T = \sum_{i \in V} v_t(i) \quad \text{(Subnet total activity)} \tag{8}$$

which is chi-square distributed with $v_T$ d.f. under the null hypothesis. Under the same null hypothesis, we define the subnetwork homogeneity $C^2$ by

$$C^2 = \sum_{i \in V} c^2(i), \qquad v_C = \sum_{i \in V} v_c(i) \quad \text{(Subnet homogeneity)} \tag{9}$$

$C^2$ is chi-square distributed with $v_C$ d.f. Now we define the subnetwork heterogeneity $D^2$ by

$$D^2 = \sum_{i \in V} d^2(i), \qquad v_D = \sum_{i \in V} v_d(i) \quad \text{(Subnet heterogeneity)} \tag{10}$$

which is chi-square distributed with $v_D$ d.f. All three subnetwork statistics are chi-square distributed, because the sum of independent chi-square random variables is also chi-squared, with d.f. being the sum of d.f. for each node (44). From Equation (7), it follows that we can decompose subnetwork total activity $T^2$ into the sum of subnetwork heterogeneity $D^2$ and homogeneity $C^2$. This gives us the subnetwork decomposition rule in Equation (1). We use $p_T$, $p_C$ and $p_D$ to represent the P-values of test statistics $T^2$, $C^2$ and $D^2$, respectively, calculated from a given sample.

In establishing the chi-square null distributions for the above three subnetwork statistics, we have assumed interactions in a subnetwork to be statistically independent. Even in an inactive subnetwork of connected nodes where we only observe noise, however, the noise dynamics can be dependent among the nodes. Thus, the chi-square approximation of the null distributions can be inaccurate. Chuang and Shih (45) assumed individual chi-squares of 2 d.f. and estimated their correlation coefficients to approximate the dependent chi-square sum by a scaled chi-square distribution. When sample sizes are limited, we found that individual interaction chi-squares of some nodes are not chi-square distributed. Additionally, estimation of the correlation matrix of individual chi-squares tends to be inaccurate. To overcome these issues, we present a gamma approximation for the sum of dependent chi-squares using a bootstrap strategy as shown in Supplementary Algorithm S1. CHINETGAMMA-BOOTSTRAP.
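The interaction-level and subnetwork-level statistics described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, tables are plain nested lists of counts, and it assumes that the per-condition expected counts are the pooled expected counts scaled by each condition's share of the total sample size. The gamma bootstrap of Supplementary Algorithm S1 is not reproduced here.

```python
def pearson_chisq(observed, expected):
    """Sum of (O - E)^2 / E over all cells with a positive expected count."""
    return sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row) if e > 0)


def node_statistics(n1, n2):
    """Return (c2, t2, d2) for one child node from two r x s count tables.

    n1, n2: contingency tables (rows = parent values, columns = child values)
    built on the same parent union, one table per condition.
    """
    r, s = len(n1), len(n1[0])
    pooled = [[n1[l][m] + n2[l][m] for m in range(s)] for l in range(r)]
    total = sum(map(sum, pooled))
    row = [sum(pooled[l]) for l in range(r)]
    col = [sum(pooled[l][m] for l in range(r)) for m in range(s)]
    # Expected pooled counts under parent-child independence.
    exp_c = [[row[l] * col[m] / total for m in range(s)] for l in range(r)]
    c2 = pearson_chisq(pooled, exp_c)            # interaction homogeneity
    t2 = 0.0                                     # interaction total activity
    for table in (n1, n2):
        nj = sum(map(sum, table))
        # Assumption: scale the pooled expectation by the condition's sample share.
        exp_j = [[nj * e / total for e in e_row] for e_row in exp_c]
        t2 += pearson_chisq(table, exp_j)
    return c2, t2, t2 - c2                       # d2 = t2 - c2 (heterogeneity)


def subnetwork_statistics(tables):
    """Sum node-level statistics over (n1, n2) pairs, one pair per child node."""
    stats = [node_statistics(n1, n2) for n1, n2 in tables]
    C2 = sum(c for c, _, _ in stats)
    T2 = sum(t for _, t, _ in stats)
    D2 = sum(d for _, _, d in stats)
    return C2, T2, D2


# Toy tables: a child that copies its parent in one condition and negates it
# in the other (rewired), and a child that copies its parent in both (conserved).
rewired = ([[10, 0], [0, 10]], [[0, 10], [10, 0]])
conserved = ([[10, 0], [0, 10]], [[10, 0], [0, 10]])
print(node_statistics(*rewired))
print(node_statistics(*conserved))
print(subnetwork_statistics([rewired, conserved]))
```

In the rewired toy pair the pooled table is flat, so all of the activity falls into heterogeneity; in the conserved pair all of it falls into homogeneity. In both cases the total equals the sum of the two parts, as the decomposition rule requires.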
To correct for the statistical effect of simultaneous testing on multiple subnetworks, we apply the Bonferroni correction to control the family-wise error rate or the Benjamini-Hochberg procedure (46) to control the false discovery rate. Both are conservative in P-value adjustment but computationally efficient. Finally, we determine if a subnetwork is rewired or conserved based on $p_D$ and $p_C$. Let α be the specified maximum acceptable type I error. The subnetworks are rewired if $p_D$ ≤ α, or conserved if $p_D$ > α and $p_C$ ≤ α. It is useful to point out that two rewired subnetworks may have strong heterogeneity and homogeneity at the same time; two conserved subnetworks can only have strong homogeneity.

Performance evaluation of ChiNet and comparison with other methods using simulated and experimental benchmark datasets

We first evaluated the performance of ChiNet in reference to GSCA and GSNCA, two differential-correlation based subnetwork rewiring methods, on simulated yeast transcription-metabolic networks. Then we benchmarked ChiNet, GSCA and GSNCA under 459 simulation settings associated with four network characteristics: noise level, sample size, complexity of dynamics (number of quantization levels) and subnetwork sparsity (number of parents, or in-degree, per child node). Both studies used a house noise model (Supplementary Note S1). Finally, we evaluated ChiNet, GSCA and GSNCA on identifying rewired subnetworks among mitochondria ribosome protein (MRP), cytoplasmic ribosome protein (RP) and rRNA genes and their TFs, using microarray gene expression data collected from two yeast species, the fungal pathogen C. albicans and S. cerevisiae. Full details of the three performance evaluation studies are described in Supplementary Notes S2, S3 and S4.

Biological experimental design and data collection

An industrial yeast strain S.
cerevisiae NRRL Y-12632 and its inhibitor-tolerant derivative NRRL Y-50049 obtained through evolutionary engineering (Agricultural Research Service Culture Collection, Peoria, IL, USA) were used in this study. Experimental design, microarray gene expression, outlier processing, normalization, discretization and gene selection are described in full detail in Supplementary Note S5.

Backtracking upstream transcription subnetworks from downstream metabolic pathways

To identify upstream transcription subnetworks that may have induced downstream metabolic responses during adaptive growth against biomass conversion inhibitors in yeast, we backtrack shortest paths linking a differential transcription interaction to genes with differential metabolic responses in Y-50049. From YEASTRACT, we identified 183 TFs and 42,524 TF-gene pairs of documented transcription regulatory interactions. This constitutes the known network topology for our study. Using this TF network topology, we first build a super graph that best fits the data of the two strains using a chi-square test (43). Then we detect all differential gene interactions between the two strains. For every differentially expressed gene on a given downstream metabolic pathway, we find a shortest path to this gene from the closest upstream TF that is involved in a differential interaction by Dijkstra's algorithm. A subnetwork is obtained by joining all such shortest paths reaching a common metabolic pathway. Then we assess by ChiNet whether the subnetwork is statistically significantly rewired across the two strains. The algorithm is presented as Supplementary Algorithm S2. BACKTRACK-REWIRED-SUBNETWORKS. The underlying assumption is that differential expression of an enzyme is caused by the most adjacent rewired upstream transcription regulation.
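The backtracking step can be sketched as a multi-source shortest-path search followed by a union of paths. This is an illustrative sketch only, not Supplementary Algorithm S2: the function names, the toy node labels and the unit edge weights are assumptions, and the super-graph fitting and the final ChiNet significance test are omitted.

```python
import heapq


def shortest_path_from_sources(graph, sources, target):
    """Dijkstra from several sources at once.

    graph: dict mapping regulator -> {regulated node: positive edge weight}.
    Returns the shortest path (list of nodes) from the nearest source to
    target, or None when target is unreachable.
    """
    dist = {s: 0 for s in sources}
    prev = {}
    heap = [(0, s) for s in sorted(sources)]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] in prev:               # walk back to the source
        path.append(prev[path[-1]])
    return path[::-1]


def backtrack_subnetwork(graph, rewired_tfs, de_pathway_genes):
    """Join shortest paths from rewired TFs to each differentially expressed gene."""
    edges = set()
    for gene in de_pathway_genes:
        path = shortest_path_from_sources(graph, rewired_tfs, gene)
        if path:
            edges.update(zip(path, path[1:]))
    return edges


# Hypothetical toy topology (placeholder labels, not real yeast interactions).
toy_graph = {"TF1": {"TF2": 1, "g1": 1}, "TF2": {"g2": 1}, "TF3": {"g2": 1}}
print(sorted(backtrack_subnetwork(toy_graph, {"TF1"}, ["g1", "g2", "g3"])))
```

Seeding the search with all TFs involved in a differential interaction at once means each target gene is reached from its closest such TF, which matches the "most adjacent rewired upstream regulation" assumption stated above.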
Biological validation

For biological validation of the tolerance impact of TF genes detected by this study, we examined six single gene deletion mutations from Saccharomyces Genome Deletion Sets for growth response to challenges of 10 mM each of furfural and HMF on a synthetic medium. A wild-type S. cerevisiae strain BY4742 (MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0) grown with and without the inhibitor challenges served as a control. Each tested strain was grown in 4 ml synthetic medium in a 15 ml tube at 30°C with agitation at 250 rpm. Cell growth was monitored by absorbance at 600 nm. Cells grown without the inhibitor challenges served as controls. Experiments were repeated for all tests.

Advantage of ChiNet by in silico benchmarking

We first demonstrate the capability of ChiNet to identify nonlinear differential interaction patterns in in silico yeast subnetworks. With 60 known yeast metabolic pathways in KEGG Pathway (42) and their upstream transcription subnetworks from YEASTRACT (47), we artificially created 60 pairs of rewired and another 60 pairs of conserved dynamic subnetworks using the generalized logical network (GLN) model (43). From these models, we simulated dynamic data at different levels of noise. Here we compare ChiNet, GSCA and its two variants, and also GSNCA. Extending the original GSCA based on linear correlation, the GSCA-order1 variant examines temporal dependencies and the GSCA-Spearman variant integrates temporal dependencies, subnetwork topology and non-linear correlation. On data from each pair of subnetwork models, we applied ChiNet, the GSCA cohort and GSNCA to determine if the pair of underlying subnetworks is rewired or conserved. Figure 2 shows the advantage of ChiNet over the GSCA cohort and GSNCA in receiver operating characteristic (ROC) curves over a wide range of noise levels. An area under the ROC curve (AUROC) of 1 indicates perfect performance, 0.5 a random guess and 0 a systematic error.
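To make the AUROC scale concrete, the following small sketch computes it as the probability that a truly rewired pair receives a higher rewiring score than a truly conserved pair, with ties counted as one half. The function name and the toy scores are hypothetical; this is a generic rank-based AUROC, not code from the study.

```python
def auroc(scores_rewired, scores_conserved):
    """Fraction of (rewired, conserved) score pairs ranked correctly; ties count 1/2."""
    wins = 0.0
    for p in scores_rewired:
        for q in scores_conserved:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_rewired) * len(scores_conserved))


print(auroc([3, 4], [1, 2]))   # every rewired pair outscores every conserved pair
print(auroc([1, 1], [1, 1]))   # scores carry no information
print(auroc([1, 2], [3, 4]))   # scores are systematically inverted
```

The three calls correspond to the three reference points named in the text: perfect performance, a random guess and a systematic error.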
As the noise level increases from 0.2 to 0.45, the gain in AUROC by ChiNet over the GSCA cohort also increases from 0.01–0.05 to 0.18–0.27; GSNCA did not function well, with AUROC notably less than that of ChiNet or GSCA at all noise levels in this study. This result thus demonstrates potentially outstanding robustness of ChiNet to noise in realistic biological networks.

To evaluate the sensitivity of ChiNet, GSCA and GSNCA to various subnetwork characteristics, we benchmarked their performance in a second simulation study. We generated a total of 91,800 pairs of subnetworks under 459 simulation settings characterized by noise level, complexity of interaction dynamics (as indicated by the number of quantization levels), sample size and sparsity of network topology (as indicated by the in-degrees of each node). Figure 3a illustrates ROC curves of the three types of methods at a specific simulation setting, demonstrating the notable strength of ChiNet. The distributions and box plots of empirical AUROC for each method are shown in Figure 3b and c. The mean AUROC over all 459 settings was 0.77 for ChiNet; 0.60, 0.63 and 0.64 for GSCA, GSCA-order1 and GSCA-Spearman, respectively; and 0.53 for GSNCA. Thus, ChiNet here demonstrates a large margin of effectiveness over differential-correlation based methods.

The fundamental limitation of differential correlation employed by GSCA and GSNCA is illustrated by an example in Figure 4. As GSCA uses the dispersion index (the summation of the squares of differences in pairwise correlation coefficients between conditions) to evaluate network rewiring, it can be insensitive to complex pattern differences in rewired subnetworks. In this example, truly rewired subnetworks scored an undesired zero dispersion index (Figure 4).
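A toy case of this limitation can be reproduced in a few lines. The data below are invented for illustration and are not the example in Figure 4: a child follows a V-shaped function of its parent in one condition and an inverted V in the other, so the two Pearson correlations are both zero and their squared difference vanishes even though the joint patterns share no cell. The helper name `pearson` is hypothetical.

```python
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


x = [-1, 0, 1] * 10                      # parent levels, same in both conditions
y_cond1 = [abs(v) for v in x]            # condition 1: V-shaped response
y_cond2 = [1 - abs(v) for v in x]        # condition 2: inverted V

r1, r2 = pearson(x, y_cond1), pearson(x, y_cond2)
squared_diff = (r1 - r2) ** 2            # one term of a dispersion-style index

table1 = Counter(zip(x, y_cond1))        # contingency counts, condition 1
table2 = Counter(zip(x, y_cond2))        # contingency counts, condition 2

print(r1, r2, squared_diff)
print(dict(table1))
print(dict(table2))
```

Any score built by subtracting the two correlations sees no difference here, whereas the two contingency tables occupy entirely different cells, which is the kind of difference a heterogeneity chi-square responds to.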
Although GSNCA uses the L1 norm and weighs differential correlation discriminatively for each node in the subnetwork, it still shares the same limitation with GSCA, as correlation coefficients are subtracted in both methods. As a result, ChiNet outperformed the GSCA cohort and GSNCA by a markedly large margin.

Validating ChiNet using transcription subnetworks rewired between two yeasts

To understand how evolution may have rewired gene regulatory networks connecting TFs and their target genes between C. albicans and S. cerevisiae, we applied ChiNet to gene subsets that contain either diverged or conserved sequence motifs in their promoter regions on a gene expression microarray compendium including 1011 S. cerevisiae and 198 C. albicans samples (5). Loss of cis-regulatory elements for MRP genes due to genome evolution has been linked to rapid anaerobic growth in S. cerevisiae relative to other aerobic yeast species (5). Although gene clusters corresponding to differentially correlated expression patterns have been identified, expression patterns of these clusters do not directly suggest TF-gene rewiring. Both rewired and not-rewired genes were expressed differentially between the two species (Supplementary Figures S11, S12 and S14). Meanwhile, their known TFs are equally enriched in the two species (Supplementary Figures S13 and S14). This implies that analyzing gene set enrichment by differential expression without looking at interaction patterns would not logically lead to evidence for rewiring here. Thus, we inspected rewired interaction patterns in subnetworks. Our analysis (Supplementary Table S1) shows that the transcription subnetwork connecting MRP genes and their TFs is highly rewired, with a normalized subnetwork heterogeneity chi-square of 53 (P-value = 0) between C. albicans and S. cerevisiae (Figure 5a).
On the other hand, the transcription subnetwork connecting cytoplasmic ribosome protein (RB) and rRNA genes and their TFs is mostly not rewired (Supplementary Figure S2), with a normalized subnetwork heterogeneity chi-square of 10 (P-value = 4 × 10⁻¹¹). These findings by ChiNet are consistent with and complementary to the transcription regulation rewiring suggested by the extent of sequence motif conservation (5). The rewired MRP gene regulation most likely contributes to the different capability for rapid anaerobic growth of S. cerevisiae versus aerobic growth of C. albicans.

Using this dataset, we again found that ChiNet remarkably outperformed GSCA and GSNCA, using the partially confirmed rewired and conserved genes between the two yeasts as a gold standard (5) (see Supplementary Note S4). ROC and precision-recall (PR) curves for all three methods (Supplementary Figures S3 to S7) were plotted under five values of subnetwork rewiring heterogeneity, which is defined as the ratio of the number of rewired genes to the total number of genes excluding TFs in the subnetwork. Figure 5b and c show AUROC and area under the PR curve (AUPR) as a function of subnetwork rewiring heterogeneity for the three methods. ChiNet exhibits a highly consistent advantage over GSCA and GSNCA at increasing subnetwork rewiring heterogeneity. In contrast to the two simulation studies, GSNCA performed better than GSCA here, demonstrating an advantage due to its implicit use of subnetwork topology. The dramatic under-performance of GSCA and GSNCA (Figure 5b and c) can be partially explained by false positives introduced by large differential correlations when the dynamic range of genes in one condition is fully covered by a larger dynamic range in another condition. In Figure 5d, e and f, the expression pattern between transcription factor gene XBP1 and a target gene EBP2 (in the not-rewired gene group) in C. albicans is almost entirely enclosed within the pattern in S. cerevisiae.
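The effect of an enclosed dynamic range on differential correlation can be sketched numerically. In the example below, both conditions follow the same hypothetical response function (`hump`, an illustrative choice and not the XBP1-EBP2 data); one condition samples the full regulator range and the other only a sub-range, so every sample of the narrow condition also occurs in the wide one.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def hump(x):
    """Hypothetical regulator-target response: rises up to x = 4, then falls."""
    return x if x <= 4 else 8 - x


wide_x = list(range(9))    # condition A covers the full regulator range 0..8
narrow_x = list(range(5))  # condition B covers only the sub-range 0..4
wide_y = [hump(x) for x in wide_x]
narrow_y = [hump(x) for x in narrow_x]

r_wide = pearson(wide_x, wide_y)        # close to 0: rise and fall cancel
r_narrow = pearson(narrow_x, narrow_y)  # 1: only the rising part is seen

# Every (regulator, target) sample of condition B also occurs in condition A.
enclosed = set(zip(narrow_x, narrow_y)) <= set(zip(wide_x, wide_y))
differential_correlation = abs(r_wide - r_narrow)

print(enclosed, round(r_wide, 3), round(r_narrow, 3))
```

Although the two conditions never contradict each other, the correlation coefficients differ by about one, which a differential-correlation score would read as strong evidence of rewiring.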
ChiNet did not score the two patterns high for rewiring because they do not contradict each other. However, the large differential XBP1-EBP2 correlation of |−0.51 − 0.11| = 0.62 would amount to falsely strong evidence for a rewired interaction across the two yeasts. A number of genes in the not-rewired gene group have such overlapping dynamic interaction patterns, which led to the poor overall performance of GSCA and GSNCA.

Globally rewired gene networks in the tolerant yeast

Applying ChiNet to transcriptome data of yeast in response to furfural and HMF, we found that the tolerant strain Y-50049 displayed significant alterations in gene regulatory networks at the global scale compared with its parental wild-type strain Y-12632. At least 44 pathways (Supplementary Table S2) were detected to significantly involve rewired upstream transcription subnetworks (Figure 6). The oxidative phosphorylation pathway was detected to have the greatest differential expression between the two strains, as suggested by its highly significant working zone change P-value (Supplementary Note S6). In addition to important central metabolic pathways, almost all amino acid metabolic pathways were affected, which represents comprehensive alterations of biosynthesis activities in the tolerant yeast. Other downstream pathways significantly affected by their upstream transcription subnetworks were involved in fatty acid metabolism and glycerolipid metabolism.

Transcription factor gene YAP1 appeared to be the most dominant regulatory gene for Y-50049 in adaptation to the toxic compounds furfural and HMF. Its adaptive signature expression impacted at least 39 downstream pathways. Among them, the glycolysis and pentose phosphate pathways showed high statistical significance in upstream transcription subnetwork heterogeneity (Supplementary Table S2).
The pentose phosphate pathway has a highly significant rewired upstream transcription network (P-value 1.97 × 10⁻¹⁴) between the tolerant yeast strain Y-50049 and the wild type (Figure 7). Eighteen TFs are involved, and most rewired TF-enzyme interactions originated from YAP1 and IFH1. Another key regulatory gene, RPN4, was observed to be adaptively activated and affected more than 20 downstream pathways through enhanced activity of at least three downstream interactions: RPN4–YOX1, REB1–RAP1 and MAL33–ABF1. However, altered regulatory interactions observed in Y-50049 genome adaptation were not limited to enhanced gene expression. As indicated by the rewired networks, regulatory genes with normal or downregulated expression may also serve regulatory functions in adaptation to the furfural-HMF stress. The activated TF gene SFP1 rooted more than 30 downstream pathways, including major biosynthesis and central metabolic pathways, through differential and conserved interactions including several regulatory genes with downregulated expression (Figure 6). For example, downstream of SFP1, TF gene IFH1 was observed to be repressed but led to activation of TYE7 and altered downstream interactions, including many amino acid metabolism pathways. TF gene ROX1 also appeared to play an important role in the yeast adaptation, involving at least 20 pathways. Under the challenge of furfural and HMF, ROX1 was normally expressed, mediating both altered and conserved interactions of genes and pathways. The downregulated TF gene CIN5, serving as both a regulon and a regulator, was linked to up-regulated gene responses and conserved pathway interactions.

We performed single-gene deletion mutations on selected TF genes including YAP1, RPN4, MSN4, ROX1, SFP1 and CIN5 to confirm gene functions in response to the toxic compounds. Strains with these mutations were all able to grow normally on a minimum medium (Supplementary Figure S8A).
However, when the medium was supplemented with furfural and HMF, these strains were significantly repressed or unable to grow compared with a wild-type control (Supplementary Figure S8B). For example, strains with a YAP1 knockout failed to grow on an inhibitor-containing medium at 72 h.

TF gene YAP1 activates the response of anti-oxidant genes by recognizing a Yap1p response element (YRE), 5′-TKACTMA-3′, in the promoter region. YAP1 was identified as a major regulator responsible for yeast tolerance to the inhibitors (4,48–49). Many genes showing induced expression possess the YRE sequence in their promoter region. Most YAP1-regulated genes were classified in a broad range of functional categories including redox metabolism, amino acid metabolism, stress response, DNA repair and others. Modified responses of glucose metabolic pathways for the tolerant yeast were defined in detail, involving many genes with reductase activity and four major cofactor regeneration steps (2,50). Recent results of engineering efforts were consistent with these findings (51,52). While a wild-type strain was repressed and died under challenge of the toxic chemicals, the tolerant strain equipped with the reprogrammed glucose metabolic pathways detoxified the inhibitors in situ and produced ethanol. Glycolysis is one of the central metabolic pathways for cell survival and function. Our results identified glycolysis as one of the most significant downstream pathways likely to be affected by rewired regulation and coregulation of YAP1.

In addition to the indispensable YAP1, we found that TF genes SFP1 and RPN4, as key regulatory genes, affect downstream metabolic pathways such as the pentose phosphate pathway (Figure 7) and many amino acid metabolism pathways (Figure 6; Supplementary Table S1). A functional pentose phosphate pathway is necessary for yeast tolerance, involving both detoxification and damage repair (2,53–59).
In this study, we found that the tolerant response of this pathway was mediated by altered RPN4 expression and through further downstream regulatory interactions including REB1 and RAP1. Chemical stress induces reactive oxygen species and damages RNA and protein conformation, leading to protein unfolding and aggregation (54,60). Many candidate genes were found to have a proteasome-associated control element of Rpn4p in their promoter regions and are potentially regulated by RPN4 (4,61–62). As suggested by this study, RPN4 expression adapted to the chemical stress in the tolerant yeast apparently played a major regulatory role leading to a functional pentose phosphate pathway. The highly sensitive response of these gene deletion mutant strains to the toxic chemicals further confirmed the essential roles of each gene involved in the rewired programs for the tolerant yeast. Our results suggest that these TF genes are essential regulators for yeast survival against the toxic compounds.

ChiNet, developed in this study, is innovative in pooling samples from all conditions into one contingency table, where conserved patterns reinforce each other while differential patterns cancel out. This allows ChiNet to detect fundamental interaction pattern rewiring in subnetworks that drive observed differential expression. Accurate detection of subnetwork rewiring enables a specific component in a network to be linked to changed biological function due to evolution. ChiNet consistently outperformed previously reported methods based on differential correlation, including GSCA, its variants and GSNCA, in both simulation and real experimental data studies. We observed that two GSCA variants with more ground-truth information did not improve much over the original GSCA, which was unexpected. It is known that zero differential correlation is neither a sufficient nor a necessary condition for two linear patterns to have the same slope (63).
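The pooling idea introduced in this discussion, where conserved patterns reinforce each other and differential patterns cancel out, can be illustrated with a minimal sketch. It uses the classical heterogeneity chi-square (the sum of per-condition chi-squares minus the chi-square of the pooled table) as an assumed stand-in; ChiNet's own statistic is defined in Equation (7) of the paper and is not reproduced in this section. The 2 × 2 tables, with rows as parent levels and columns as child levels, are hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic of independence for a contingency table."""
    row_sums = [sum(r) for r in table]
    col_sums = [sum(c) for c in zip(*table)]
    n = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            if expected > 0:  # skip cells in empty rows or columns
                stat += (observed - expected) ** 2 / expected
    return stat


def pool(table_1, table_2):
    """Add two contingency tables cell by cell (samples pooled over conditions)."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(table_1, table_2)]


def heterogeneity(table_1, table_2):
    """Sum of per-condition chi-squares minus the chi-square of the pooled table."""
    total = chi_square(table_1) + chi_square(table_2)
    return total - chi_square(pool(table_1, table_2))


# Hypothetical counts: rows = parent (TF) low/high, columns = child low/high.
activation = [[10, 0], [0, 10]]  # child follows parent
repression = [[0, 10], [10, 0]]  # child opposes parent

conserved = heterogeneity(activation, activation)
rewired = heterogeneity(activation, repression)
print(conserved, rewired)
```

When both conditions show the same pattern, the pooled table carries the full association and the heterogeneity is zero; when the patterns oppose each other, the pooled table is flat and the whole per-condition signal is attributed to heterogeneity.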
The overall similarity in performance among GSNCA, GSCA and its variant GSCA-order1 suggests that differential correlation may constitute the bottleneck, despite the correct Markovian order being used in GSCA-order1. The GSCA-Spearman variant computes nonlinear Spearman correlation coefficients with the correct subnetwork topology, but was not able to improve upon GSCA or GSCA-order1. A possible explanation is that a pair of nodes indirectly connected via a path can still be sensitive to differential correlation along the path. In addition, nonlinear correlation coefficients may compress interaction patterns even further, so that truly differential interaction patterns map to similar values and become indistinguishable. For example, all monotonically increasing patterns, which may represent very different interaction dynamics, have equal Spearman correlation coefficients. Although the recently developed GSNCA method showed improved capability over GSCA through implicitly integrated network topology, our results suggest that summarizing an interaction pattern by a correlation coefficient and then comparing the statistic across conditions is fundamentally ineffective at capturing diverse interaction patterns that may share similar correlation coefficient values.

Integration of the rewired subnetworks into a global network (Figure 6) enabled us to zoom in on a small number of highly involved TF genes as hubs. Although only YAP1 has been shown to be activated in oxidative responses specifically due to furfural and HMF (64) as thiol-reactive electrophiles (65), many of the hub genes, whose biochemistry with furfural and HMF is yet to be studied, are known to be involved in various stress responses in yeast. Another hub gene, RAP1, coordinates IFH1 binding to ribosome protein to regulate protein synthesis in response to growth stimuli and environmental stresses (66). RAP1 also directly regulates YDR248C in the pentose phosphate pathway (Figure 7).
The protein product of PUT3, with rewired links to the proline metabolism pathway, activates PUT1 and PUT2, which encode enzymes of proline utilization (67). Proline is a protectant against stresses including freezing, desiccation, oxidation and ethanol in yeast (68). MAL33 is activated at low glucose levels so that other sources of sugar such as maltose and galactose can be utilized (69). CIN5, with sequence homology to YAP1, increased resistance to cisplatin when over-expressed in S. cerevisiae (70). RPN4 promotes DNA repair, antioxidant response and glucose metabolism under genotoxic stresses (71). Direct evidence (72) implicated MSN2/MSN4 in induction of NTH1, which controls trehalose hydrolysis under heat and osmotic stresses. ROX1 is a transcriptional regulator of oxidative responses and up-regulated about 30 genes involved in cell stress under anaerobic conditions (73). Together, the functional coincidence of these TF genes in various stress responses supports the global view highlighted by our ChiNet analysis that ties these genes to their downstream metabolic pathways. The absence of a single upstream hub gene seems to imply that multiple genomic loci may be responsible for the rewired transcription program in Y-50049. Thus, such rewiring most likely confers tolerance to toxic furfural and HMF on the yeast strain Y-50049.

Several considerations are to be made before applying ChiNet to detect subnetwork rewiring. First, the sample size needs to be sufficient such that the expected number of samples in each contingency table entry is at least five. Second, ChiNet discretizes molecular abundance to flexibly represent interaction patterns that are not well understood, so as to prevent biases associated with an unvalidated parametric model.
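The first two considerations interact: more quantization levels mean more contingency table entries and therefore a larger sample size to keep every expected count at five or above. The sketch below checks this condition for a hypothetical TF-target pair. Equal-width binning is an assumption made for illustration only; the quantization scheme actually used by ChiNet is not described in this section.

```python
def quantize(values, levels):
    """Equal-width binning into integer levels 0..levels-1 (assumed scheme).

    Assumes the values are not all identical, so the bin width is positive.
    """
    low, high = min(values), max(values)
    width = (high - low) / levels
    return [min(int((v - low) / width), levels - 1) for v in values]


def contingency_table(parent_levels, child_levels, levels):
    """Count samples for every (parent level, child level) combination."""
    table = [[0] * levels for _ in range(levels)]
    for p, c in zip(parent_levels, child_levels):
        table[p][c] += 1
    return table


def min_expected_count(table):
    """Smallest expected cell count under independence of rows and columns."""
    row_sums = [sum(r) for r in table]
    col_sums = [sum(c) for c in zip(*table)]
    n = sum(row_sums)
    return min(r * c / n for r in row_sums for c in col_sums)


def meets_sample_size_rule(parent, child, levels, threshold=5):
    """True if every expected contingency table entry is at least `threshold`."""
    table = contingency_table(quantize(parent, levels),
                              quantize(child, levels), levels)
    return min_expected_count(table) >= threshold


# Hypothetical data: 40 samples of TF abundance and target abundance.
tf_abundance = list(range(40))
target_abundance = [2 * v for v in tf_abundance]

print(meets_sample_size_rule(tf_abundance, target_abundance, levels=2))
print(meets_sample_size_rule(tf_abundance, target_abundance, levels=4))
```

With 40 samples, two quantization levels give expected counts of 10 per entry and pass the check, whereas four levels give 2.5 per entry and fail it.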
Quantization, despite sacrificing data resolution, can be beneficial due to its noise removal effect, as demonstrated in Supplementary Note S4, where the minimal quantization level of two achieved the best performance in both AUROC and AUPR. If interaction patterns already have valid parametric models, information loss due to discretization can be avoided by alternative methods. For example, comparative dynamical system modeling (63) characterizes interaction heterogeneity from continuous time course data with nonlinear parametric models. Finally, ChiNet requires a subnetwork topology in the form of a directed graph. If a topology is unavailable but subnetwork rewiring analysis is desired, one can use a network inference method as a first step to reconstruct the topology from observed data. Then, as the second step in this workflow, one applies ChiNet to perform subnetwork rewiring analysis.

ChiNet is applicable to integrating both proteome and transcriptome data. Specifically, one can use the protein abundance of TFs as parent variables and the RNA abundance of target genes as child variables in contingency tables to compute the interaction heterogeneity chi-square statistic defined in Equation (7). On transcriptomic data alone, without protein activity information, the outcome of rewiring analysis may be incomplete when post-transcriptional regulation leads to non-monotonic or even random translational patterns. For example, a recent study on the proteogenome of human colon and rectal cancers revealed a positive sample-wise mRNA–protein correlation but also observed that mRNA abundance is not a reliable predictor of protein variation for individual genes (74). However, highly positively correlated mRNA and protein abundance were observed for most genes in fission yeast (75). Our benchmark study by ChiNet using transcriptome data alone from C. albicans versus S. cerevisiae highlighted subnetwork rewiring consistent with changed genotypes.
The value of ChiNet lies in pointing to such candidate subnetworks for further investigation.

In conclusion, our benchmark studies have demonstrated substantially improved accuracy of ChiNet in subnetwork rewiring analysis over alternative methodologies. Further, ChiNet revealed transcription subnetwork rewiring in the molecular mechanisms underlying yeast tolerance and robust strain development for advanced biofuels production. ChiNet is readily applicable to integrating transcriptomic, proteomic and metabolomic data to understand network rewiring fundamental to the evolution of biological systems. The ChiNet software is freely available to non-commercial users at

ACCESSION NUMBERS

The tolerant yeast microarray data have been deposited in NCBI's Gene Expression Omnibus (76) and are accessible through GEO Series accession number GSE50492 (http://

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors thank the anonymous reviewers for their feedback to improve the quality of this manuscript. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture. USDA is an equal opportunity provider and employer.

FUNDING

USDA National Research Initiative [2006-35504-17359, in part]; NIH National Cancer Institute [1U54CA132383]; NSF CREST Center for Bioinformatics and Computational Biology [HRD-0420407]; NSF MRI [CNS-1337884]; NIH New Mexico IDeA Networks of Biomedical Research Excellence [2P20GM103451-14]; NIH Mountain West Clinical Translational Research [1U54GM104944-2]. Funding for open access charge: NIH [1U54GM104944-2].

Conflict of interest statement. None declared.

REFERENCES

1. Sun, M.G., Sikora, M., Costanzo, M., Boone, C. and Kim, P.M. (2012) Network evolution: rewiring and signatures of conservation in signaling. PLoS Comput. Biol., 8, e1002411.
2. Liu, Z.L., Ma, M. and Song, M. (2009) Evolutionarily engineered ethanologenic yeast detoxifies lignocellulosic biomass conversion inhibitors by reprogrammed pathways. Mol. Genet. Genomics, 282, 233-244.
3. Liu, Z.L., Moon, J., Andersh, B., Slininger, P.J. and Weber, S. (2008) Multiple gene-mediated NAD(P)H-dependent aldehyde reduction is a mechanism of in situ detoxification of furfural and 5-hydroxymethylfurfural by Saccharomyces cerevisiae. Appl. Microbiol. Biotechnol., 81, 743-753.
4. Ma, M. and Liu, Z.L. (2010) Comparative transcriptome profiling analyses during the lag phase uncover YAP1, PDR1, PDR3, RPN4, and HSF1 as key regulatory genes in genomic adaptation to the lignocellulose derived inhibitor HMF for Saccharomyces cerevisiae. BMC Genomics, 11, 660.
5. Ihmels, J., Bergmann, S., Gerami-Nejad, M., Yanai, I., McClellan, M., Berman, J. and Barkai, N. (2005) Rewiring of the yeast transcriptional network through the evolution of motif usage. Science, 309, 938-940.
6. Tirosh, I., Weinberger, A., Carmi, M. and Barkai, N. (2006) A genetic signature of interspecies variations in gene expression. Nat. Genet., 38, 830-834.
7. Filtz, T.M., Vogel, W.K. and Leid, M. (2014) Regulation of transcription factor activity by interconnected post-translational modifications. Trends Pharmacol. Sci., 35, 76-85.
8. Khurana, E., Fu, Y., Colonna, V., Mu, X.J., Kang, H.M., Lappalainen, T., Sboner, A., Lochovsky, L., Chen, J., Harmanci, A. et al. (2013) Integrative annotation of variants from 1092 humans: Application to cancer genomics. Science, 342, 1235587.
9. Khatri, P., Sirota, M. and Butte, A.J. (2012) Ten years of pathway analysis: current approaches and outstanding challenges. PLoS Comput. Biol., 8, e1002375.
10. Mitra, K., Carvunis, A.-R., Ramesh, S.K. and Ideker, T. (2013) Integrative approaches for finding modular structure in biological networks. Nat. Genet., 14, 719-732.
11. Khatri, P., Draghici, S., Ostermeier, G.C. and Krawetz, S.A. (2002) Profiling gene expression using onto-express. Genomics, 79, 266-270.
12. Draghici, S., Khatri, P., Martins, R.P., Ostermeier, G.C. and Krawetz, S.A. (2003) Global functional profiling of gene expression. Genomics, 81, 98-104.
13. Backes, C., Keller, A., Kuentzer, J., Kneissl, B., Comtesse, N., Elnakady, Y.A., Müller, R., Meese, E. and Lenhof, H.-P. (2007) GeneTrail-advanced gene set enrichment analysis. Nucleic Acids Res., 35, W186-W192.
14. Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S. et al. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. U.S.A., 102, 15545-15550.
15. Yi, M. and Stephens, R.M. (2008) SLEPR: a sample-level enrichment-based pathway ranking method--seeking biological themes through pathway-level consistency. PLoS One, 3, e3288.
16. Irizarry, R.A., Wang, C., Zhou, Y. and Speed, T.P. (2009) Gene set enrichment analysis made simple. Stat. Methods Med. Res., 18, 565-575.
17. Simon, R., Lam, A., Li, M.-C., Ngan, M., Menenzes, S. and Zhao, Y. (2007) Analysis of gene expression data using BRB-array tools. Cancer Inform., 3, 11-17.
18. Yi, M., Mudunuri, U., Che, A. and Stephens, R.M. (2009) Seeking unique and common biological themes in multiple gene lists or datasets: pathway pattern extraction pipeline for pathway-level comparative analysis. BMC Bioinformatics, 10, 200.
19. Sartor, M.A., Mahavisno, V., Keshamouni, V.G., Cavalcoli, J., Wright, Z., Karnovsky, A., Kuick, R., Jagadish, H.V., Mirel, B., Weymouth, T. et al. (2010) ConceptGen: a gene set enrichment and gene set relation mapping tool. Bioinformatics, 26, 456-463.
20. Poisson, L.M., Sreekumar, A., Chinnaiyan, A.M. and Ghosh, D. (2012) Pathway-directed weighted testing procedures for the integrative analysis of gene expression and metabolomic data. Genomics, 99, 265-274.
21. Hwang, T. and Park, T. (2009) Identification of differentially expressed subnetworks based on multivariate ANOVA. BMC Bioinformatics, 10, 128.
22. Shojaie, A. and Michailidis, G. (2010) Network enrichment analysis in complex experiments. Stat. Appl. Genet. Mol. Biol., 9, Article ID 22.
23. Draghici, S., Khatri, P., Tarca, A.L., Amin, K., Done, A., Voichita, C., Georgescu, C. and Romero, R. (2007) A systems biology approach for pathway level analysis. Genome Res., 17, 1537-1545.
24. Tarca, A.L., Draghici, S., Khatri, P., Hassan, S.S., Mittal, P., Kim, J.-S., Kim, C.J., Kusanovic, J.P. and Romero, R. (2009) A novel signaling pathway impact analysis. Bioinformatics, 25, 75-82.
25. Gosline, S.J., Spencer, S.J., Ursu, O. and Fraenkel, E. (2012) SAMNet: a network-based approach to integrate multi-dimensional high throughput datasets. Integr. Biol. (Camb), 4, 1415-1427.
26. Jung, S. and Kim, S. (2014) EDDY: a novel statistical gene set test method to detect differential genetic dependencies. Nucleic Acids Res., 42, e60.
27. Guo, Z., Li, Y., Gong, X., Yao, C., Ma, W., Wang, D., Li, Y., Zhu, J., Zhang, M., Yang, D. et al. (2007) Edge-based scoring and searching method for identifying condition-responsive protein-protein interaction sub-network. Bioinformatics, 23, 2121-2128.
28. Choi, Y. and Kendziorski, C. (2009) Statistical methods for gene set co-expression analysis. Bioinformatics, 25, 2780-2786.
29. Alvo, M., Liu, Z., Williams, A. and Yauk, C. (2010) Testing for mean and correlation changes in microarray experiments: an application for pathway analysis. BMC Bioinformatics, 11, 60.
30. Ma, H., Schadt, E.E., Kaplan, L.M. and Zhao, H. (2011) COSINE: COndition-SpecIfic sub-NEtwork identification using a global optimization method. Bioinformatics, 27, 1290-1298.
31. Gambardella, G., Moretti, M.N., de Cegli, R., Cardone, L., Peron, A. and di Bernardo, D. (2013) Differential network analysis for the identification of condition-specific pathway activity and regulation. Bioinformatics, 29, 1776-1785.
32. Rahmatallah, Y., Emmert-Streib, F. and Glazko, G. (2014) Gene Sets Net Correlations Analysis (GSNCA): a multivariate differential coexpression test for gene sets. Bioinformatics, 30, 360-368.
33. Ideker, T. and Krogan, N.J. (2012) Differential network biology. Mol. Syst. Biol., 8, 565.
34. Wall, J.D., Harwood, C. and Demain, D. (2008) Bioenergy. American Society for Microbiology Press, Washington DC.
35. Vertes, A.A., Qureshi, N. and Yukawa, H. (2010) Biomass to Biofuels: Strategies for Global Industries, Wiley, Chichester.
36. Larsson, S., Palmqvist, E., Hahn-Hägerdal, B., Tengborg, C., Stenberg, K., Zacchi, G. and Nilvebrant, N.-O. (1999) The generation of fermentation inhibitors during dilute acid hydrolysis of softwood. Enzyme Microb. Technol., 24, 151-159.
37. Klinke, H.B., Thomsen, A. and Ahring, B.K. (2004) Inhibition of ethanol-producing yeast and bacteria by degradation products produced during pre-treatment of biomass. Appl. Microbiol. Biotechnol., 66, 10-26.
38. Liu, Z.L., Slininger, P.J. and Gorsich, S.W. (2005) Enhanced biotransformation of furfural and hydroxymethylfurfural by newly developed ethanologenic yeast strains. Appl. Biochem. Biotechnol., 121-124, 451-460.
39. Liu, Z.L. and Blaschek, H.P. (2010) Biomass conversion inhibitors and in situ detoxification. In: Biomass to Biofuels: Strategies for Global Industries. Blackwell Publishing Ltd., Chichester, pp. 233-259.
40. Liu, Z.L. (2011) Molecular mechanisms of yeast tolerance and in situ detoxification of lignocellulose hydrolysates. Appl. Microbiol. Biotechnol., 90, 809-825.
41. Song, M., Zhang, Y., Katzaroff, A.J., Edgar, B.A. and Buttitta, L. (2014) Hunting complex differential gene interaction patterns across molecular contexts. Nucleic Acids Res., 42, e57.
42. Kanehisa, M., Goto, S., Sato, Y., Furumichi, M. and Tanabe, M. (2012) KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res., 40, D109-D114.
43. Song, M., Lewis, C.K., Lance, E.R., Chesler, E.J., Yordanova, R.K., Langston, M.A., Lodowski, K.H. and Bergeson, S.E. (2009) Reconstructing generalized logical networks of transcriptional regulation in mouse brain from temporal gene expression data. EURASIP J. Bioinform. Syst. Biol., 2009, Article ID 545176.
44. Casella, G. and Berger, R.L. (1990) Statistical Inference, Duxbury Press, Belmont, CA.
45. Chuang, L. and Shih, Y. (2012) Approximated distributions of the weighted sum of correlated chi-squared random variables. J. Stat. Plan. Inference, 142, 457-472.
46. Benjamini, Y. and Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B, 57, 289-300.
47. Teixeira, M.C., Monteiro, P.T., Guerreiro, J.F., Goncalves, J.P., Mira, N.P., dos Santos, S.C., Cabrito, T.R., Palma, M., Costa, C., Francisco, A.P. et al. (2014) The YEASTRACT database: an upgraded information system for the analysis of gene and genomic transcription regulation in Saccharomyces cerevisiae. Nucleic Acids Res., 42, D161-D166.
48. Lin, F.-M., Qiao, B. and Yuan, Y.-J. (2009) Comparative proteomic analysis of tolerance and adaptation of ethanologenic Saccharomyces cerevisiae to furfural, a lignocellulosic inhibitory compound. Appl. Environ. Microbiol., 75, 3765-3776.
49. Gulshan, K., Lee, S.S. and Moye-Rowley, W.S. (2011) Differential oxidant tolerance determined by the key transcription factor Yap1 is controlled by levels of the Yap1-binding protein, Ybp1. J. Biol. Chem., 286, 34071-34081.
50. Jordan, D.B., Braker, J.D., Bowman, M.J., Vermillion, K.E., Moon, J. and Liu, Z.L. (2011) Kinetic mechanism of an aldehyde reductase of Saccharomyces cerevisiae that relieves toxicity of furfural and 5-hydroxymethylfurfural. Biochim. Biophys. Acta, 1814, 1686-1694.
51. Moon, J. and Liu, Z.L. (2012) Engineered NADH-dependent GRE2 from Saccharomyces cerevisiae by directed enzyme evolution enhances HMF reduction using additional cofactor NADPH. Enzyme Microb. Technol., 50, 115-120.
52. Jayakody, L.N., Horie, K., Hayashi, N. and Kitagaki, H. (2013) Engineering redox cofactor utilization for detoxification of glycolaldehyde, a key inhibitor of bioethanol production, in yeast Saccharomyces cerevisiae. Appl. Microbiol. Biotechnol., 97, 6589-6600.
53. Gorsich, S.W., Dien, B.S., Nichols, N.N., Slininger, P.J., Liu, Z.L. and Skory, C.D. (2006) Tolerance to furfural-induced stress is associated with pentose phosphate pathway genes ZWF1, GND1, RPE1, and TKL1 in Saccharomyces cerevisiae. Appl. Microbiol. Biotechnol., 71, 339-349.
54. Allen, S.A., Clark, W., McCaffery, J.M., Cai, Z., Lanctot, A., Slininger, P.J., Liu, Z.L. and Gorsich, S.W. (2010) Furfural induces reactive oxygen species accumulation and cellular damage in Saccharomyces cerevisiae. Biotechnol. Biofuels, 3, 1-10.
55. Hasunuma, T., Sanda, T., Yamada, R., Yoshimura, K., Ishii, J. and Kondo, A. (2011) Metabolic pathway engineering based on metabolomics confers acetic and formic acid tolerance to a recombinant xylose-fermenting strain of Saccharomyces cerevisiae. Microb. Cell Fact., 10, 2-13.
56. Ding, M.-Z., Wang, X., Liu, W., Cheng, J.-S., Yang, Y. and Yuan, Y.-J. (2012) Proteomic research reveals the stress response and detoxification of yeast to combined inhibitors. PLoS One, 7, e43474.
57. Andrew, E.J., Merchan, S., Lawless, C., Banks, A.P., Wilkinson, D.J. and Lydall, D. (2013) Pentose phosphate pathway function affects tolerance to the G-Quadruplex binder TMPyP4. PLoS One, 8, e66242.
58. Gonzalez-Ramos, D., van den Broek, M., van Maris, A.J., Pronk, J.T. and Daran, J.M. (2013) Genome-scale analyses of butanol tolerance in Saccharomyces cerevisiae reveal an essential role of protein degradation. Biotechnol. Biofuels, 6, 1754-6834.
59. Hao, X.-C., Yang, X.-S., Wan, P. and Tian, S. (2013) Comparative proteomic analysis of a new adaptive Pichia Stipitis strain to furfural, a lignocellulosic inhibitory compound. Biotechnol. Biofuels, 6, 34.
60. Goldberg, A.L. (2003) Protein degradation and protection against misfolded or damaged proteins. Nature, 426, 895-899.
61. Wang, X., Xu, H., Ha, S.-W., Ju, D. and Xie, Y. (2010) Proteasomal degradation of Rpn4 in Saccharomyces cerevisiae is critical for cell viability under stressed conditions. Genetics, 184, 335-342.
62. Kahar, P., Taku, K. and Tanaka, S. (2011) Enhancement of xylose uptake in 2-deoxyglucose tolerant mutant of Saccharomyces cerevisiae. J. Biosci. Bioeng., 111, 557-563.
63. Ouyang, Z., Song, M., Güth, R., Ha, T.J., Larouche, M. and Goldowitz, D. (2011) Conserved and differential gene interactions in dynamical biological systems. Bioinformatics, 27, 2851-2858.
64. Song, M., Ouyang, Z. and Liu, Z. (2009) Discrete dynamical system modelling for gene regulatory networks of 5-hydroxymethylfurfural tolerance for ethanologenic yeast. IET Syst. Biol., 3, 203-218.
65. Kim, D. and Hahn, J.-S. (2013) Roles of the Yap1 transcription factor and antioxidants in Saccharomyces cerevisiae's tolerance to furfural and 5-hydroxymethylfurfural, which function as thiol-reactive electrophiles generating oxidative stress. Appl. Environ. Microbiol., 79, 5069-5077.
66. Wade, J.T., Hall, D.B. and Struhl, K. (2004) The transcription factor Ifh1 is a key regulator of yeast ribosomal protein genes. Nature, 432, 1054-1058.
67. Siddiqui, A.H. and Brandriss, M.C. (1989) The Saccharomyces cerevisiae PUT3 activator protein associates with proline-specific upstream activation sequences. Mol. Cell. Biol., 9, 4706-4712.
68. Takagi, H. (2008) Proline as a stress protectant in yeast: physiological functions, metabolic regulations, and biotechnological applications. Appl. Microbiol. Biotechnol., 81, 211-223.
69. dos Santos, S.C., Tenreiro, S., Palma, M., Becker, J. and Sa-Correia, I. (2009) Transcriptomic profiling of the Saccharomyces cerevisiae response to quinine reveals a glucose limitation response attributable to drug-induced inhibition of glucose uptake. Antimicrob. Agents Chemother., 53, 5213-5223.
70. Furuchi, T., Ishikawa, H., Miura, N., Ishizuka, M., Kajiya, K., Kuge, S. and Naganuma, A. (2001) Two nuclear proteins, Cin5 and Ydr259c, confer resistance to cisplatin in Saccharomyces cerevisiae. Mol. Pharmacol., 59, 470-474.
71. Spasskaya, D., Karpov, D., Mironov, A. and Karpov, V. (2014) Transcription factor Rpn4 promotes a complex antistress response in Saccharomyces cerevisiae cells exposed to methyl methanesulfonate. Mol. Biol., 48, 141-149.
72. Zahringer, H., Thevelein, J.M. and Nwaka, S. (2000) Induction of neutral trehalase Nth1 by heat and osmotic stress is controlled by STRE elements and Msn2/Msn4 transcription factors: variations of PKA effect during stress and growth. Mol. Microbiol., 35, 397-406.
73. Kwast, K.E., Lai, L.-C., Menda, N., James, D.T., Aref, S. and Burke, P.V. (2002) Genomic analyses of anaerobically induced genes in Saccharomyces cerevisiae: functional roles of Rox1 and other factors in mediating the anoxic response. J. Bacteriol., 184, 250-265.
74. Zhang, B., Wang, J., Wang, X., Zhu, J., Liu, Q., Shi, Z., Chambers, M.C., Zimmerman, L.J., Shaddox, K.F., Kim, S. et al. (2014) Proteogenomic characterization of human colon and rectal cancer. Nature, 513, 382-387.
75. Marguerat, S., Schmidt, A., Codlin, S., Chen, W., Aebersold, R. and Bahler, J. (2012) Quantitative analysis of fission yeast transcriptomes and proteomes in proliferating and quiescent cells. Cell, 151, 671-683.
76. Edgar, R., Domrachev, M. and Lash, A.E. (2002) Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res., 30, 207-210.

Yang Zhang, Z. Lewis Liu, Mingzhou Song. ChiNet uncovers rewired transcription subnetworks in tolerant yeast for advanced biofuels conversion, Nucleic Acids Research, 2015, 4393-4407, DOI: 10.1093/nar/gkv358