Joint analysis of expression profiles from multiple cancers improves the identification of microRNA–gene interactions

Bioinformatics, Sep 2013

Motivation: MicroRNAs (miRNAs) play a crucial role in tumorigenesis and development through their effects on target genes. The characterization of miRNA–gene interactions will lead to a better understanding of cancer mechanisms. Many computational methods have been developed to infer miRNA targets with/without expression data. Because expression datasets are in general limited in size, most existing methods concatenate datasets from multiple studies to form one aggregated dataset to increase sample size and power. However, such simple aggregation analysis results in identifying miRNA–gene interactions that are mostly common across datasets, whereas specific interactions may be missed by these methods. Recent releases of The Cancer Genome Atlas data provide paired expression profiling of miRNAs and genes in multiple tumors with sufficiently large sample size. To study both common and cancer-specific interactions, it is desirable to develop a method that can jointly analyze multiple cancers to study miRNA–gene interactions without combining all the data into one single dataset.

Results: We developed a novel statistical method to jointly analyze expression profiles from multiple cancers to identify miRNA–gene interactions that are both common across cancers and specific to certain cancers. The benefit of this joint analysis approach is demonstrated by both simulation studies and real data analysis of The Cancer Genome Atlas datasets. Compared with simple aggregate analysis or single sample analysis, our method can effectively use the shared information among different but related cancers to improve the identification of miRNA–gene interactions. Another useful property of our method is that it can estimate similarity among cancers through their shared miRNA–gene interactions.

Availability and implementation: The program, MCMG, implemented in R is available at http://bioinformatics.med.yale.edu/group/.

Contact: hongyu.zhao{at}yale.edu

Advance Access publication June

Xiaowei Chen [2], Frank J. Slack [1] and Hongyu Zhao [0,2,3]

[0] Department of Biostatistics, Yale School of Public Health
[1] Department of Molecular, Cellular and Developmental Biology, Yale University
[2] Program in Computational Biology and Bioinformatics, Yale University
[3] Department of Genetics, Yale School of Medicine
New Haven, CT 06511, USA

Associate Editor: Ivo Hofacker

*To whom correspondence should be addressed.

© The Author 2013. Published by Oxford University Press. All rights reserved.

1 INTRODUCTION

MicroRNAs (miRNAs) (~22 nt) are important non-coding small RNAs that regulate gene expression by repressing the translation of, or degrading, target genes through complementary base pairing to 3′ untranslated regions (3′ UTRs) of genes (Bartel, 2004). They are involved in many cancer-related processes, such as cell growth and differentiation, through regulating their target gene expression (Esquela-Kerscher and Slack, 2006). Considering the importance of miRNAs in cancers and that they regulate a large number of genes, deciphering miRNA and gene interactions at the genome level can lead to a better understanding of tumorigenesis and development.

In recent years, many computational approaches have been developed to predict miRNA targets. Sequence-based prediction algorithms build on specific binding rules, including sequence complementarity, secondary structure, energy, conservation and site accessibility, to predict miRNA–gene interactions. Some representative methods include TargetScanS/TargetScan (Lewis et al., 2003, 2005), miRanda (Enright et al., 2003) and PicTar (Krek et al., 2005). Although these methods provide a list of potential target genes for each miRNA, they suffer from a relatively high false-positive rate because of the complex nature of miRNA–gene interactions (Sethupathy et al., 2006).
In addition, the predictions are static and may not capture those interactions that are specific to certain diseases or conditions.

To improve the specificity of sequence-based predictions and identify condition-specific interactions, efforts have been made to incorporate expression profiles to study miRNA regulatory mechanisms. The basic principle of these methods is that genes regulated by a miRNA should exhibit negative expression correlations with the miRNA. These methods include those based on simple correlation analysis (Liu et al., 2010; Van der Auwera et al., 2010), simple/regularized regression models (Kim et al., 2009; Lu et al., 2011; Muniategui et al., 2012a) and Bayesian inference (Huang et al., 2007; Su et al., 2011). Pearson correlation in the category of simple correlation analysis is the most straightforward way to study miRNA–gene interactions. However, the simplicity of this method usually results in a relatively high false-positive rate. Lasso regression (Lu et al., 2011; Muniategui et al., 2012a) in the category of reg (...truncated)
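
The negative-correlation principle described above, and the way pooling datasets can dilute a cancer-specific signal, can be illustrated with a small sketch. This is not the authors' MCMG method; it only computes plain Pearson correlations on simulated data. The function names (`pearson`, `simulate_cancer`), the sample sizes, the effect sizes and the noise level are all hypothetical choices made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def simulate_cancer(n, slope, rng):
    """Toy data (assumption, not TCGA): gene = slope * miRNA + noise."""
    mirna = [rng.gauss(0, 1) for _ in range(n)]
    gene = [slope * m + rng.gauss(0, 0.5) for m in mirna]
    return mirna, gene


rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical cancer A: the miRNA represses the gene (negative slope).
mir_a, gene_a = simulate_cancer(200, -1.0, rng)
# Hypothetical cancer B: no interaction between the same miRNA and gene.
mir_b, gene_b = simulate_cancer(200, 0.0, rng)

r_a = pearson(mir_a, gene_a)                         # cancer A alone
r_b = pearson(mir_b, gene_b)                         # cancer B alone
r_pooled = pearson(mir_a + mir_b, gene_a + gene_b)   # simple aggregation

print(f"cancer A: {r_a:.2f}  cancer B: {r_b:.2f}  pooled: {r_pooled:.2f}")
```

In this toy setting the pooled correlation falls between the two per-cancer values, so an interaction present only in cancer A looks weaker after simple concatenation. This is the kind of cancer-specific signal that, according to the abstract, aggregation analysis may miss.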


This is a preview of a remote PDF: https://bioinformatics.oxfordjournals.org/content/29/17/2137.full.pdf

Xiaowei Chen, Frank J. Slack, Hongyu Zhao. Joint analysis of expression profiles from multiple cancers improves the identification of microRNA–gene interactions, Bioinformatics, 2013, pp. 2137-2145, 29/17, DOI: 10.1093/bioinformatics/btt341