An Interaction Network Predicted from Public Data as a Discovery Tool: Application to the Hsp90 Molecular Chaperone Machine

PLOS ONE, Oct 2011

Understanding the functions of proteins requires information about their protein-protein interactions (PPI). The collective effort of the scientific community generates far more data on any given protein than individual experimental approaches. The latter are often too limited to reveal an interactome comprehensively. We developed a workflow for mining all major PPI databases in parallel, covering data from several model organisms, and for integrating data from the literature for a protein of interest. We applied this novel approach to build the PPI network of the human Hsp90 molecular chaperone machine (Hsp90Int), for which previous efforts have yielded limited and poorly overlapping sets of interactors. We demonstrate the power of the Hsp90Int database as a discovery tool by validating the prediction that the Hsp90 co-chaperone Aha1 is involved in nucleocytoplasmic transport. Thus, we both describe how to build a custom database and introduce a powerful new resource for the scientific community.


Citation: Echeverría PC, Bernthaler A, Dupuis P, Mayer B, Picard D (2011) An Interaction Network Predicted from Public Data as a Discovery Tool: Application to the Hsp90 Molecular Chaperone Machine. PLoS ONE 6(10): e26044. doi:10.1371/journal.pone.0026044

Authors: Pablo C. Echeverría, Andreas Bernthaler, Pierre Dupuis, Bernd Mayer, Didier Picard

Affiliations: 1 Département de Biologie Cellulaire, Université de Genève, Genève, Switzerland; 2 emergentec biodevelopment GmbH, Wien, Austria

Editor: Sue Cotterill, St. George's University of London, United Kingdom

The comprehensive determination of the interactome of a protein of interest (POI) is technically challenging and in many cases impossible, even though it is ultimately indispensable for understanding its functions.
While there may often not be one correct way of screening for interactors of a POI, there is already a huge amount of data on protein-protein interactions in general from large-scale screens performed with different techniques and species. Thus, mining public databases in addition to extracting relevant information from the literature may be a more efficient approach to building a POI interactome that is reliable enough to serve as a discovery tool. There is clearly a need for a workflow that extracts the data available for a POI, which are scattered across multiple databases and the scientific literature, and combines them into a single virtual interactome.

Hsp90 is a highly abundant and conserved molecular chaperone that exists in both prokaryotes and eukaryotes. The cytosolic isoforms, known for example as Hsp90α and Hsp90β in humans, are essential and have been most extensively studied [1–3]. Although Hsp90 has an intrinsic ATPase activity that drives its conformational changes, it functions as a multicomponent molecular machine. A large cohort of cofactors, referred to as cochaperones in this context, modulate many aspects of this machine, including ATPase activity, substrate recognition and selectivity, and the binding and release of substrates [4]. It has been speculated that the Hsp90 chaperone machine may assist up to 10% of all cytosolic proteins at some stage of their life cycle [5], but how it recognizes its substrates and, in most cases, what it does to them remain very poorly understood. Most likely because of the central role of Hsp90 in many cellular processes, cancer cells, pathogens, and viruses may be particularly dependent on it. This has led to great interest in developing specific Hsp90 inhibitors, several of which are now in clinical trials for the treatment of cancer [6,7].
Identifying the proteins that interact with Hsp90, whether as regulators, cochaperones, or substrates (clients), is essential for understanding the global functions of this molecular machine. A variety of biochemical and genetic efforts have been undertaken to define molecular chaperone networks more generally (for example, refs. [8,9–11]) and the Hsp90 interactome specifically [12–22]. However, for Hsp90, the poor overlap between their respective hits and the number of known false negatives have been rather frustrating, most likely owing to the transient nature of many of these interactions. Further discussion of these issues and of the available approaches can be found in a very recent review [23]. Since standard proteomic or genomic approaches are unable to capture the interactome of this molecular chaperone machine comprehensively, applying our new workflow to it appeared particularly appropriate. The specific result is a powerful discovery tool that will serve any scientific community whose paths may cross Hsp90.

Construction of Hsp90Int

A step-by-step protocol and scripts for building a PPI network for one's own POI(s) are provided in File S1. As indicated in the text, PPI data were retrieved and edited from public databases and the literature. For each of the seven model organisms, the data were stored in tab-delimited text files. For each pair of interacting proteins, these files contain information about the source database, the experimental system employed to determine the interaction, and the corresponding PubMed reference(s) where available. All the PPI information contained in our text files was processed further and converted into visualizable PPI networks using Cytoscape 2.6.3 with a spring-embedded layout [24]. Proteins in the query list were identified and selected in each network.
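To make the file layout described above more concrete, the following is a minimal sketch of how such a tab-delimited PPI file could be read and how query proteins could be located in it. It is not the script from File S1: the column order in `COLUMNS`, the `|` separator for multiple PubMed identifiers, the function names, and the toy rows are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column layout; the actual files in File S1 may differ.
COLUMNS = ["protein_a", "protein_b", "source_db",
           "experimental_system", "pubmed_ids"]


def load_ppi_table(handle):
    """Read tab-delimited PPI rows into a dict keyed by unordered protein pair.

    Each pair maps to a list of evidence records, one per input row.
    """
    interactions = defaultdict(list)
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        # frozenset makes (A, B) and (B, A) the same undirected interaction.
        pair = frozenset((row["protein_a"], row["protein_b"]))
        # Assumed convention: several PubMed IDs are joined with "|".
        pubmed = [p for p in (row["pubmed_ids"] or "").split("|") if p]
        interactions[pair].append({
            "source_db": row["source_db"],
            "experimental_system": row["experimental_system"],
            "pubmed_ids": pubmed,
        })
    return dict(interactions)


def select_query_nodes(interactions, query):
    """Return the query proteins that actually occur in the network."""
    nodes = set().union(*interactions.keys()) if interactions else set()
    return nodes & set(query)


# Toy rows for illustration only; these are not real database records.
example = io.StringIO(
    "P1\tP2\tdbX\ttwo-hybrid\t111|222\n"
    "P2\tP1\tdbY\taffinity capture\t\n"
    "P2\tP3\tdbX\ttwo-hybrid\t333\n"
)
table = load_ppi_table(example)
print(len(table))
print(sorted(select_query_nodes(table, ["P1", "P9"])))
```

Keying the table by an unordered pair means that the same interaction reported by two databases, or in both orientations, ends up as one entry carrying several evidence records.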
To detect and extract the first level of interactors of the query list, as well as the interactions between these neighbors, we used the Cytoscape tools "Select first neighbors of selected nodes" and "New network from selected nodes, all edges". Each species-specific network was filtered to eliminate PPIs already described in humans by intersecting it with the human network using the "intersection" feature of the Cytoscape "Merge networks" tool. After converting the species-specific PPI networks into human interolog networks, we used the Cytoscape tool "Advanced network merge" to merge them into a single network (Hsp90Int). Note that whenever the available data did not specify which of the two cytosolic Hsp90 isoforms, Hsp90α or Hsp90β, was meant, we arbitrarily assumed that both were meant. In general, the current datasets are too incomplete to allow a meaningful inference of isoform-specific interactomes.

Graph measures and data evaluation

A PPI network can be represented as a graph in which proteins are nodes (or vertices) and interactions are edges. We therefore describe the base network of the seven model organisms as $G_B = (V_B, E_B)$ and the Hsp90Int network as $G_H = (V_H, E_H)$. For $G_H$ we calculated the following graph measures: mean degree, diameter, index of aggregation, connectivity, clustering coefficient, and assortative mixing coefficient. The graph measures were calculated with previously reported formulas [25] with partial incorporation into the JUNG (...truncated)
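As a rough illustration of the network operations and two of the graph measures described in this section, the sketch below uses plain Python sets instead of Cytoscape. It is not the authors' implementation: all function names are hypothetical, the species network is assumed to be already mapped to human identifiers, and the mean degree and clustering coefficient follow common textbook definitions that may differ from the formulas in ref. [25].

```python
from itertools import combinations


def edge(a, b):
    """Undirected edge between two proteins."""
    return frozenset((a, b))


def first_neighbor_subnetwork(edges, query):
    """Keep query proteins, their first neighbors, and all edges among them."""
    query = set(query)
    selected = set(query)
    for e in edges:
        if e & query:
            selected |= e
    return {e for e in edges if e <= selected}


def remove_known_human(interolog_edges, human_edges):
    """Drop interactions that are already described in the human network."""
    return set(interolog_edges) - set(human_edges)


def merge_networks(*networks):
    """Union of several edge sets into a single network."""
    return set().union(*networks)


def adjacency(edges):
    """Neighbor sets per node; self-interactions are ignored."""
    adj = {}
    for e in edges:
        if len(e) != 2:
            continue
        a, b = tuple(e)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def mean_degree(edges):
    adj = adjacency(edges)
    return sum(len(n) for n in adj.values()) / len(adj) if adj else 0.0


def mean_clustering(edges):
    """Average local clustering coefficient (nodes of degree < 2 count as 0)."""
    adj = adjacency(edges)
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


# Toy networks for illustration only; "Q" stands for a query protein.
human = {edge("Q", "A"), edge("Q", "B"), edge("A", "B"), edge("C", "D")}
other_species = {edge("Q", "A"), edge("Q", "E")}  # assumed mapped to human IDs

human_sub = first_neighbor_subnetwork(human, {"Q"})
species_sub = first_neighbor_subnetwork(other_species, {"Q"})
novel = remove_known_human(species_sub, human)
merged = merge_networks(human_sub, novel)

print(len(merged), mean_degree(merged), mean_clustering(merged))
```

In this toy case the edge between C and D is dropped because neither protein neighbors the query, and the Q-A interaction from the other species is filtered out because it is already present in the human network.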


Article home page: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0026044

Pablo C. Echeverría, Andreas Bernthaler, Pierre Dupuis, Bernd Mayer, Didier Picard. An Interaction Network Predicted from Public Data as a Discovery Tool: Application to the Hsp90 Molecular Chaperone Machine, PLOS ONE, 2011, 10, DOI: 10.1371/journal.pone.0026044