RAIN: RNA–protein Association and Interaction Networks

Jan 2017

Protein association networks can be inferred from a range of resources including experimental data, literature mining and computational predictions. These types of evidence are emerging for non-coding RNAs (ncRNAs) as well. However, integration of ncRNAs into protein association networks is challenging due to data heterogeneity. Here, we present a database of ncRNA–RNA and ncRNA–protein interactions and its integration with the STRING database of protein–protein interactions. These ncRNA associations cover four organisms and have been established from curated examples, experimental data, interaction predictions and automatic literature mining. RAIN uses an integrative scoring scheme to assign a confidence score to each interaction. We demonstrate that RAIN outperforms the underlying microRNA-target predictions in inferring ncRNA interactions. RAIN can be operated through an easily accessible web interface and all interaction data can be downloaded. Database URL: http://rth.dk/resources/rain

Database, 2017, 1–9
doi: 10.1093/database/baw167
Original article

Alexander Junge1,2,†, Jan C. Refsgaard3,†, Christian Garde1,4,†, Xiaoyong Pan1,2,3, Alberto Santos3, Ferhat Alkan1,2, Christian Anthon1,2, Christian von Mering5, Christopher T. Workman1,4, Lars Juhl Jensen1,3,* and Jan Gorodkin1,2,*

1 Center for Non-coding RNA in Technology and Health, University of Copenhagen, Groennegaardsvej 3, DK-1870 Frederiksberg C, Denmark
2 Department of Veterinary Clinical and Animal Sciences, University of Copenhagen, Groennegaardsvej 3, DK-1870 Frederiksberg C, Denmark
3 Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Building: 06-2-26, Blegdamsvej 3B, DK-2200 Copenhagen N, Denmark
4 Center for Biological Sequence Analysis, Technical University of Denmark, Kemitorvet, Building 208, DK-2800 Lyngby, Denmark
5 Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland

*Corresponding author: Tel: +45 3533 3578; Fax: +45 3533 3042. Correspondence may also be addressed to Jan Gorodkin. Tel: +45 35 32 50 25
Present address: Christian Garde, The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Building 6.6 Blegdamsvej 3B, 2200 Copenhagen N, Copenhagen, Denmark
† These authors contributed equally to this work.

Citation details: Junge,A., Refsgaard,J.C., Garde,C. et al. RAIN: RNA–protein association and interaction networks. Database (2016) Vol. 2016: article ID baw167; doi:10.1093/database/baw167

Revised 18 November 2016; Accepted 5 December 2016

© The Author(s) 2017. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction

The study of protein-coding genes and the accumulation of data from expression studies and other complementary methods have helped researchers to generate protein association networks compiled in resources such as the STRING database (1). Using a probabilistic scoring scheme, STRING assigns a score to each physical interaction and functional association (henceforth referred to as interactions). The recent version 10 holds interactions for >2000 organisms. However, interaction networks containing only proteins and their interactions remain incomplete until other important molecular interactions have been included. For this reason, we have focused on complementing protein interaction networks with non-coding RNAs (ncRNAs), a large class of genes comprising 16 000 long and 10 000 short ncRNAs in human [GENCODE version 24 (2)]. Integration of these interactions allows for an analysis of the complex functional interplay of ncRNA–RNA and ncRNA–protein interactions. Data on such interactions, complemented by co-expression and literature mining, are currently emerging (3–5).

This led to the generation of databases storing ncRNA interactions such as miRTarBase (6) and TarBase (7) containing microRNA (miRNA)–target interactions. NPInter (5), RAID (8) and StarBase (9) are examples of databases collecting interactions between ncRNAs and proteins. The analysis of ncRNA interactions is challenged by issues related to data heterogeneity, such as varying quality as well as the usage of different identifiers and interaction scoring schemes. The STRING database, used by thousands of researchers daily, has addressed these challenges for proteins through the use of unified identifiers and calibrated scoring schemes (1). A resource similar to STRING is not available for ncRNAs and their interactions. Similar to protein interactions, ncRNA interactions are supported by diverse sources of evidence such as expert curation, experiments, text mining and predictions. In order to compare these so (...truncated)

(...) set of sources and covers four organisms: human (Homo sapiens), mouse (Mus musculus), rat (Rattus norvegicus) and baker’s yeast (Saccharomyces cerevisiae). RAIN scores the reliability of each interaction using a scoring scheme based on the comparison to a curated set of interactions. It finally integrates ncRNA–RNA and ncRNA–protein associations with protein–protein associations contained in the STRING database. This enables researchers to explore complex interaction networks in the powerful, yet intuitive interactive STRING user interface.

Materials and Methods

Sources of evidence

We established four channels of evidence to support the interactions found in RAIN, namely, (i) curated knowledge, (ii) experimental evidence, (iii) miRNA target predictions and (iv) automated literature mining, see Figure 1. Each of the four evidence channels is generated by integrating a number of underlying resources.

(i) Curated knowledge. This comprises 867 human molecular interactions that are well established in the scientific literature and/or listed in expert curated databases. The interactions were collected for nine classes of ncRNAs, namely microRNA (miRNA) (3), ribosomal RNA (rRNA) (10), transfer RNA (tRNA) (11), signal recognition particle RNA (SRP RNA) (12), Vault RNA (13–15), Y RNA (16–18), Telomerase RNA (19), small nucleolar RNA (snoRNA) (20) and spliceosomal RNA (U1, U2, U4, U4atac, U6, U6atac, U11, U12) (20). For further (...)

Figure 1. Flow chart illustrating the development of the RAIN database, ranging from establishing scoring schemes for the individual sources of evidence, through integration of resources to evidence channels, to finally defining functional molecular networks.
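
The text describes four evidence channels that each support an interaction, and an integrative scheme that assigns one confidence score per interaction. The excerpt does not state the formula RAIN uses, so the following is only a minimal sketch of one common way to merge per-channel confidences: a noisy-OR, which treats channels as independent pieces of evidence. The channel names, the function name and the example scores are hypothetical and chosen for illustration.

```python
from math import prod

# Hypothetical channel labels mirroring the four evidence channels in the text.
CHANNELS = ("curated", "experiments", "predictions", "textmining")


def combine_channel_scores(scores):
    """Merge per-channel confidence scores (each in [0, 1]) with a noisy-OR.

    Assumption: channels are independent, so the probability that at least
    one channel is right is 1 minus the product of the per-channel error
    probabilities. This is an illustrative choice, not RAIN's documented
    formula.
    """
    for name, score in scores.items():
        if name not in CHANNELS:
            raise ValueError(f"unknown channel: {name}")
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name} must lie in [0, 1]")
    return 1.0 - prod(1.0 - score for score in scores.values())


# Made-up scores for a single ncRNA-protein pair seen in three channels.
example = {"experiments": 0.6, "predictions": 0.3, "textmining": 0.5}
combined = combine_channel_scores(example)
print(round(combined, 3))
```

Under this assumption the combined score is never lower than the strongest single channel, and an interaction without any evidence scores 0. A calibrated scheme would typically also correct for the prior probability of an interaction, which this sketch leaves out.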


This is a preview of a remote PDF: https://database.oxfordjournals.org/content/2017/baw167.full.pdf
Article home page: http://database.oxfordjournals.org/content/2017/baw167.abstract

Alexander Junge, Jan C. Refsgaard, Christian Garde, Xiaoyong Pan, Alberto Santos, Ferhat Alkan, Christian Anthon, Christian von Mering, Christopher T. Workman, Lars Juhl Jensen, Jan Gorodkin. RAIN: RNA–protein Association and Interaction Networks. Database, 2017. DOI: 10.1093/database/baw167