Principles of self-organization in biological pathways: a hypothesis on the autogenous association of alpha-synuclein
Andreas Zanzoni, Domenica Marchese, Federico Agostini, Benedetta Bolognesi, Davide Cirillo, Maria Botta-Orfila, Carmen Maria Livi, Silvia Rodriguez-Mulero, Gian Gaetano Tartaglia
Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
Present address: Andreas Zanzoni, Inserm TAGC U1090, Aix-Marseille Université, Parc Scientifique de Luminy, 13288 Marseille, France
Gene Function and Evolution, Bioinformatics and Genomics, Centre for Genomic Regulation (CRG), 08003 Barcelona, Spain
Previous evidence indicates that a number of proteins are able to interact with cognate mRNAs. These autogenous associations represent important regulatory mechanisms that control gene expression at the translational level. Using the catRAPID approach to predict the propensity of proteins to bind to RNA, we investigated the occurrence of autogenous associations in the human proteome. Our algorithm correctly identified binding sites in well-known cases such as thymidylate synthase, tumor suppressor P53, synaptotagmin-1, serine/arginine-rich splicing factor 2, heat shock 70 kDa, ribonucleic particle-specific U1A and ribosomal protein S13. In addition, we found that several other proteins are able to bind to their own mRNAs. A large-scale analysis of biological pathways revealed that aggregation-prone and structurally disordered proteins have the highest propensity to interact with cognate RNAs. These findings are substantiated by experimental evidence on amyloidogenic proteins such as TAR DNA-binding protein 43 and fragile X mental retardation protein. Among the amyloidogenic proteins, we predicted that Parkinson's disease-related α-synuclein is highly prone to interact with cognate transcripts, which suggests the existence of RNA-dependent factors in its function and dysfunction. Indeed, as aggregation is intrinsically concentration dependent, it is possible that autogenous interactions play a crucial role in controlling protein homeostasis.
INTRODUCTION
Although proteins are involved in almost every cellular
process, increasing evidence indicates that coding and
non-coding RNAs play fundamental roles in gene
regulation (1,2) and disease (3,4). Recent studies showed that
establishment of aberrant associations or disruption of
functional protein-RNA interactions occurs in
neurological disorders (5,6). For instance, interaction with
RNA favors conversion of alpha-helix rich prion protein
PrPC into the pathogenic beta-structure-rich insoluble
conformer PrPSc that propagates in Creutzfeldt-Jakob
disease (7). In Alzheimer's disease, the association
between Amyloid Precursor Protein mRNA and
iron regulatory protein 1 is disrupted, resulting in
compromised translation efficiency and elevated
cytotoxicity (8).
Protein-RNA associations regulate several processes
such as synthesis, folding, translocation, assembly and
clearance of molecules. Previous studies suggested that
ribonucleoprotein interactions might be able to facilitate
protein and RNA folding (9,10). As a matter of fact, it has
been observed that there is strong affinity between amino
acids and their corresponding codons (11,12), which could
imply a direct interaction between proteins and their own
mRNAs (13,14). Indeed, TAR DNA-binding protein 43
(TDP-43) and Fragile X Mental Retardation protein
(FMRP) have been found to interact with their own
mRNAs (15,16). In these cases, expression is regulated
by a negative feedback loop involving the 3′ untranslated
region (UTR). Other autogenous associations have been
observed in proteins associated with cell proliferation and
gene expression (17,18). Structurally disordered
proteins such as Serine/Arginine-rich splicing factor 2
(SRSF2) (19) and heterogeneous ribonucleoprotein
members (20,21) can also inhibit their own translation by
associating with their cognate mRNAs.
How often do autogenous associations occur in the
human proteome? Recent technological advances
revealed that a large number of proteins have
RNA-binding abilities (22), which suggests that interaction
with cognate mRNAs could be more frequent than
previously thought. Are autogenous associations linked to
specific functions? It is possible that autoregulatory
mechanisms are involved in controlling protein production. For
instance, in the case of TDP-43 and FMRP, inhibition of
expression via autogenous interaction is a way to preserve
protein functionality (15,16). Overexpression leads to high
protein production and enhanced amyloidogenicity,
resulting in harmful gain- or loss-of-function effects on
cellular metabolism (23).
In this work, we focused on the ability of proteins to
establish autogenous associations. Using our
computational approach catRAPID (24), we studied the
occurrence of these interactions in the human proteome. A
large-scale analysis was performed to identify the role of
autogenous associations in biological pathways and
characterize their properties.
MATERIALS AND METHODS
Biological pathway annotations
We downloaded (September 2012) pathway data from two
manually curated and high-quality resources: Reactome
(25) and the NCI Pathway Interaction Database
(NCI-PID) (26). The Reactome annotations (version 41) were
gathered via the BioMart query interface returning a
list of 167 canonical pathways containing 5375 unique
protein coding genes, whereas the NCI-PID pathways
were fetched directly from the database website (241
pathways, 2053 unique protein coding genes). In both
cases, UniProtKB (27) accession numbers were
converted to Ensembl (version 68) gene identifiers using the
UniProtKB ID-mapping file (version 2012_08).
Subsequently, the gene pathway annotations were
transferred to the corresponding polypeptides and
coding/non-coding transcripts.
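The identifier conversion and annotation transfer described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: it assumes the ID-mapping file is tab-separated with three columns (accession, identifier type, identifier) and that gene identifiers carry the type label "Ensembl". The function names, accessions, gene identifiers and pathway names are hypothetical toy values.

```python
from collections import defaultdict


def parse_idmapping(lines, id_type="Ensembl"):
    """Map each protein accession to its gene identifiers.

    Assumes three tab-separated columns per row:
    accession, identifier type, identifier.
    """
    mapping = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # skip malformed rows
        accession, kind, identifier = fields
        if kind == id_type:
            mapping[accession].add(identifier)
    return mapping


def pathways_by_gene(pathway_members, acc_to_gene):
    """Transfer pathway annotations from accessions to gene identifiers."""
    annotations = defaultdict(set)
    for pathway, accessions in pathway_members.items():
        for accession in accessions:
            # Accessions without a gene mapping are silently dropped.
            for gene in acc_to_gene.get(accession, ()):
                annotations[gene].add(pathway)
    return annotations


# Toy input (hypothetical identifiers, not taken from the real resources).
rows = [
    "ACC_1\tEnsembl\tGENE_A",
    "ACC_1\tGeneID\t12345",
    "ACC_2\tEnsembl\tGENE_B",
]
members = {
    "pathway_X": {"ACC_1", "ACC_2"},
    "pathway_Y": {"ACC_2", "ACC_UNMAPPED"},
}

acc_to_gene = parse_idmapping(rows)
gene_annotations = pathways_by_gene(members, acc_to_gene)
```

In this sketch a gene inherits every pathway of every accession that maps to it; the same dictionary could then be used to label the polypeptides and transcripts derived from each gene.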
Protein-RNA interaction prediction
We used the catRAPID algorithm (24) to predict
interaction propensity among all peptides and transcripts
belonging to Reactome and NCI-PID pathways. catRAPID
was trained on a large set of protein-RNA pairs available
in the Protein Data Bank to discriminate interacting and
non-interacting molecules using secondary structure
propensities, hydrogen bonding and van der Waals
contributions (28). The method was tested on the non-nucleic
acid-binding database (NNBP; area under the receiver
operating characteristic curve of 0.92), the NPInter
database (area under the receiver operating characteristic
curve of 0.88) and a number of individual interactions
(e.g. RNAse mitochondrial RNA MRP and X-inactive
specific transcript XIST networks; average accuracy of
78%). Owing to CPU limitations in the calculation (29),
we restricted the predictions to RNA sequences with a
length between 50 and 1500 nt as well as to polypeptides
with a length between 50 and 750 amino acids. The
fragment and strength algorithms were used to
identify regions involved in the binding and compute the
specificity with respect to random protein-RNA
associations (5,29). For each protein-RNA pair under
investigation, a reference set of (...truncated)