An Interaction Network Predicted from Public Data as a Discovery Tool: Application to the Hsp90 Molecular Chaperone Machine
Picard D (2011) An Interaction Network Predicted from Public Data as a Discovery Tool: Application to
the Hsp90 Molecular Chaperone Machine. PLoS ONE 6(10): e26044. doi:10.1371/journal.pone.0026044
Pablo C. Echeverría
Andreas Bernthaler
Pierre Dupuis
Bernd Mayer
Didier Picard
Sue Cotterill, St. George's University of London, United Kingdom
1 Département de Biologie Cellulaire, Université de Genève, Genève, Switzerland, 2 emergentec biodevelopment GmbH, Wien, Austria
Understanding the functions of proteins requires information about their protein-protein interactions (PPI). The collective effort of the scientific community generates far more data on any given protein than individual experimental approaches. The latter are often too limited to reveal an interactome comprehensively. We developed a workflow to mine all major PPI databases in parallel, covering data from several model organisms, and to integrate data from the literature for a protein of interest. We applied this novel approach to build the PPI network of the human Hsp90 molecular chaperone machine (Hsp90Int), for which previous efforts have yielded limited and poorly overlapping sets of interactors. We demonstrate the power of the Hsp90Int database as a discovery tool by validating the prediction that the Hsp90 cochaperone Aha1 is involved in nucleocytoplasmic transport. Thus, we both describe how to build a custom database and introduce a powerful new resource for the scientific community.
The comprehensive determination of the interactome of a protein
of interest (POI) is technically challenging and in many cases
impossible, even though it is ultimately indispensable to understand
its functions. While there may often not be one correct way of
screening for interactors of a POI, there is already a huge amount of
data on protein-protein interactions in general from large-scale screens
performed with different techniques and species. Thus, mining public
databases in addition to extracting relevant information from the
literature may be a more efficient approach to building a POI
interactome that is reliable enough to serve as a discovery tool.
There is clearly a need to develop a workflow to extract the data that
are available for a POI but scattered across multiple databases and the
scientific literature into a single virtual interactome.
Hsp90 is a highly abundant and conserved molecular chaperone
that exists both in prokaryotes and in eukaryotes. The cytosolic
isoforms, known for example as Hsp90α and Hsp90β in humans,
are essential and have been most extensively studied [1–3].
Although Hsp90 has an intrinsic ATPase activity that drives its
conformational changes, it really functions as a multicomponent
molecular machine. A large cohort of cofactors, referred to as
cochaperones in this context, modulate many aspects of this machine
including ATPase activity, recognition and selectivity, binding and
release of substrates [4]. It has been speculated that the Hsp90
chaperone machine may assist up to 10% of all cytosolic proteins
at some stage of their life cycle [5], but how it recognizes its
substrates and, in most cases, what it does to them remain very
poorly understood. Most likely because of the central role of
Hsp90 in many cellular processes, cancer cells, pathogens, and
viruses may be particularly dependent on it. This has led to a great
interest in developing specific Hsp90 inhibitors, of which several
are now in clinical trials for the treatment of cancer [6,7].
Identifying the proteins that interact with Hsp90, whether as
regulators, cochaperones, or substrates (clients), is essential to
understanding the global functions of this molecular
machine. A variety of biochemical and genetic efforts have been
undertaken to define molecular chaperone networks more
generally (for example, refs. [8,9–11]) and the Hsp90 interactome
specifically [12–22]. However, for Hsp90, the overlap between
their respective hits has been poor and the known false negatives
numerous, most likely owing to the transient nature of
many of these interactions. Further discussions of these issues and
of the available approaches can be found in a very recent review
[23]. Since standard proteomic or genomic approaches for this
molecular chaperone machine are unable to capture the
interactome comprehensively, the application of our new workflow
to it appeared particularly appropriate. The specific result is a
powerful discovery tool that will serve any scientific community
whose paths may cross Hsp90.
Construction of Hsp90Int
A step-by-step protocol and scripts for building a PPI network
for one's own POI(s) are provided in File S1. As indicated in the
text, PPI data were retrieved and edited from public databases and
the literature. For each of the seven model organisms, the data
were stored in tab-delimited text files. For each pair of interacting
proteins, these files contain the information about the source
database, the experimental system employed to determine the
interaction, and the corresponding PubMed reference(s) where
available. All the PPI information contained in our text files was
subjected to further processing and dynamic manipulation by
conversion into visualizable PPI networks using Cytoscape 2.6.3
with a spring-embedded layout [24]. Proteins in the query list were
identified and selected in each network. To detect and to extract
the first level of interactors of the query list as well as interactions
between these neighbors, we used the Cytoscape tools "Select first
neighbors of selected nodes" and "New network.from selected
nodes, all edges".
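
The file layout and the first-neighbor extraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scripts of File S1: the column names (protein_a, protein_b, source_db, experimental_system, pubmed_ids), the toy identifiers and the helper functions are all hypothetical, and the selection is done in plain Python rather than in Cytoscape.

```python
import csv
import io
from collections import defaultdict

# Toy tab-delimited file. Column names and all values are made up for
# illustration; the real files described in the text may be laid out differently.
SAMPLE_TSV = (
    "protein_a\tprotein_b\tsource_db\texperimental_system\tpubmed_ids\n"
    "Q1\tA\tDB1\ttwo-hybrid\t\n"
    "Q1\tB\tDB2\taffinity capture\t\n"
    "A\tB\tDB1\ttwo-hybrid\t\n"
    "B\tC\tDB1\ttwo-hybrid\t\n"
    "C\tD\tDB2\taffinity capture\t\n"
)


def read_ppi(handle):
    """Parse tab-delimited rows into a list of interaction records (dicts)."""
    return list(csv.DictReader(handle, delimiter="\t"))


def build_adjacency(records):
    """Build an undirected adjacency map from the interaction records."""
    adjacency = defaultdict(set)
    for rec in records:
        a, b = rec["protein_a"], rec["protein_b"]
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def first_neighbor_subnetwork(adjacency, query):
    """Select the query proteins plus their direct interactors, and keep
    every edge whose two ends are both in that selection."""
    present = {q for q in query if q in adjacency}
    selected = set(present)
    for q in present:
        selected |= adjacency[q]
    edges = {
        frozenset((a, b))
        for a in selected
        for b in adjacency[a]
        if b in selected and a != b
    }
    return selected, edges


records = read_ppi(io.StringIO(SAMPLE_TSV))
nodes, edges = first_neighbor_subnetwork(build_adjacency(records), {"Q1"})
print(sorted(nodes))
print(sorted(tuple(sorted(e)) for e in edges))
```

In this toy example the query protein Q1 pulls in A and B, and the A-B edge is retained because both of its ends are selected, whereas C and D lie beyond the first level and are dropped.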
Each species-specific network was filtered in order to eliminate
PPIs already described in humans by intersecting it with the
human network using the Cytoscape intersection feature from the
"Merge networks" tool. After converting the species-specific PPI
networks into human interolog networks, we used the Cytoscape
tool "Advanced network merge" to merge them into a unique
network (Hsp90Int).
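
The filtering and merging logic can be illustrated with plain set operations. The sketch below is not the Cytoscape procedure itself: it assumes that an ortholog table (here the hypothetical dictionary ortholog_map) is available to translate species identifiers into human ones before the comparison with the human network, and all identifiers and function names are invented for the example.

```python
def to_interologs(species_edges, ortholog_map):
    """Translate species-specific edges into human identifiers.
    Edges with a partner lacking a human ortholog are dropped."""
    mapped = set()
    for a, b in species_edges:
        ha, hb = ortholog_map.get(a), ortholog_map.get(b)
        if ha and hb and ha != hb:
            mapped.add(frozenset((ha, hb)))
    return mapped


def novel_interologs(interologs, human):
    """Remove interactions already described in humans, identified by
    intersecting the interolog network with the human network."""
    already_known = interologs & human
    return interologs - already_known


def merge_networks(human, novel_by_species):
    """Merge all networks into one, recording where each edge came from."""
    merged = {edge: {"human"} for edge in human}
    for species, edges in novel_by_species.items():
        for edge in edges:
            merged.setdefault(edge, set()).add(species)
    return merged


# Toy data: identifiers are made up for illustration.
human = {frozenset(("H1", "H2"))}
yeast_edges = [("y1", "y2"), ("y2", "y3"), ("y3", "y4")]
ortholog_map = {"y1": "H1", "y2": "H2", "y3": "H3"}  # y4 has no ortholog

interologs = to_interologs(yeast_edges, ortholog_map)
novel = novel_interologs(interologs, human)
merged = merge_networks(human, {"yeast": novel})
for edge, sources in merged.items():
    print(sorted(edge), sorted(sources))
```

Keeping the source label on each edge is a design choice of this sketch; it makes it possible to tell afterwards which interactions in the merged network are predicted from another species rather than observed in humans.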
Note that whenever the available data did not specify which of
the two cytosolic Hsp90 isoforms, Hsp90α or β, was meant, we
arbitrarily assumed it was both. In general, the current datasets are
too incomplete to allow a meaningful inference of isoform-specific
interactomes.
Graph measures and data evaluation
A PPI network can be represented as a graph in which proteins are
nodes (or vertices) and interactions are edges. Therefore, we
describe the base network of the seven model organisms as
G_B = (V_B, E_B) and the network from Hsp90Int as G_H = (V_H, E_H).
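
To make the graph representation concrete, a graph given as a vertex set and an edge set can be turned into an adjacency map from which simple measures follow directly. The sketch below uses the standard textbook definitions of mean degree, average local clustering coefficient and diameter; these are assumptions for illustration and may differ in detail from the formulas used for Hsp90Int. The toy graph and function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def adjacency(vertices, edges):
    """Build an undirected adjacency map from G = (V, E)."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def mean_degree(adj):
    """Average number of edges per node, i.e. 2|E| / |V|."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than two
    neighbors contribute zero."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def diameter(adj):
    """Longest shortest path, found by breadth-first search from every
    node. Assumes the graph is connected."""
    longest = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        longest = max(longest, max(dist.values()))
    return longest


# Toy graph: a triangle A-B-C with a fourth node D attached to C.
V = {"A", "B", "C", "D"}
E = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
adj = adjacency(V, E)
print(mean_degree(adj), average_clustering(adj), diameter(adj))
```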
For GH we calculated the graph measures mean degree, diameter,
index of aggregation, connectivity, clustering coefficient, and
assortative mixing coefficient. The graph measures were calculated
with previously reported formulas [25] with partial incorporation
into the JUNG (...truncated)