Genome annotation improvements from cross-phyla proteogenomics and time-of-day differences in malaria mosquito proteins using untargeted quantitative proteomics
RESEARCH ARTICLE
Lisa Imrie1☯, Thierry Le Bihan1,2,3☯, Áine O’Toole4, Paul V. Hickner5, W. Augustine Dunn6,
Benjamin Weise2, Samuel S. C. Rund2,5*
1 SynthSys–Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh,
Edinburgh, United Kingdom, 2 Centre for Immunity, Infection and Evolution, University of Edinburgh,
Edinburgh, United Kingdom, 3 Rapid Novor, Kitchener, Ontario, Canada, 4 Institute of Evolutionary Biology,
University of Edinburgh, Edinburgh, United Kingdom, 5 Eck Institute for Global Health, University of Notre
Dame, Notre Dame, Indiana, United States of America, 6 Boston Children’s Hospital, Boston,
Massachusetts, United States of America
☯ These authors contributed equally to this work.
*
OPEN ACCESS
Citation: Imrie L, Le Bihan T, O’Toole Á, Hickner
PV, Dunn WA, Weise B, et al. (2019) Genome
annotation improvements from cross-phyla
proteogenomics and time-of-day differences in
malaria mosquito proteins using untargeted
quantitative proteomics. PLoS ONE 14(7):
e0220225. https://doi.org/10.1371/journal.pone.0220225
Editor: Walter S. Leal, University of California-Davis, UNITED STATES
Received: October 24, 2018
Accepted: July 11, 2019
Published: July 29, 2019
Copyright: © 2019 Imrie et al. This is an open
access article distributed under the terms of the
Creative Commons Attribution License, which
permits unrestricted use, distribution, and
reproduction in any medium, provided the original
author and source are credited.
Data Availability Statement: All relevant data are
within the manuscript and its Supporting
Information files with the exception of the raw
mass spectrometry files, which can be found on
DataDryad at https://doi.org/10.5061/dryad.8p20m31.
Funding: Samuel S.C. Rund was funded by a Royal Society Newton International Fellowship
(NF140517) and a strategic award from the Wellcome Trust (No. 095831) for the Centre for
Immunity, Infection and Evolution. Áine O’Toole was funded by the Wellcome Trust
(202769/Z/16/Z; PhD programme in Hosts, Pathogens and Global Health). The LC-MS QExactive
equipment was purchased by a Wellcome Trust Institutional Strategic Support Fund and a
strategic award from the Wellcome Trust for the Centre for Immunity, Infection and
Evolution (095831/Z/11/Z). Rapid Novor provided support in the form of a salary for
author TLB, but did not have any additional role in the study design, data collection and
analysis, decision to publish, or preparation of the manuscript. The remaining funders
also had no role in study design, data collection and analysis, decision to publish, or
preparation of the manuscript.
Competing interests: TLB has received salary from Rapid Novor. TLB’s employment at Rapid
Novor does not alter our adherence to PLOS ONE policies on sharing data and materials.
The other authors have declared that no competing interests exist. The specific roles of
all authors are articulated in the ‘author contributions’ section.
PLOS ONE | https://doi.org/10.1371/journal.pone.0220225 July 29, 2019
Abstract
The malaria mosquito, Anopheles stephensi, and other mosquitoes modulate their biology to
match the time-of-day. In the present work, we used a non-hypothesis-driven approach (untargeted proteomics) to identify proteins in mosquito tissue, and then quantified the relative
abundance of the identified proteins from An. stephensi bodies. Using these quantified protein
levels, we then analyzed the data for proteins that were only detectable at certain times of
day, highlighting the need to consider time-of-day in experimental design. Further, we
extended our time-of-day analysis to look for proteins that cycle in a rhythmic 24-hour (“circadian”) manner, identifying 31 rhythmic proteins. Finally, to maximize the utility of our data,
we performed a proteogenomic analysis to improve the genome annotation of An. stephensi.
We compared peptides that were detected using mass spectrometry but are ‘missing’ from the
An. stephensi predicted proteome to reference proteomes from 38 other primarily human disease vector species. We found 239 such peptide matches and show that genome annotation
can be improved using proteogenomic analysis from taxonomically diverse reference proteomes. Examination of ‘missing’ peptides revealed reading frame errors, errors in gene-calling,
overlapping gene models, and suspected gaps in the genome assembly.
Introduction
Anopheles stephensi is a major malaria vector in southern Asia where its geographic range
extends across the Indian subcontinent [1]. Research on the African Anopheles gambiae mosquito has demonstrated that the behavior and physiology of the mosquito are highly dependent
on circadian biology and time-of-day. For example, ~20% of An. gambiae genes were
rhythmically expressed over the 24-hour day [2]; rhythmically expressed mosquito olfaction
genes correspond with rhythmic protein levels and time-of-day changes in electrophysiological sensitivity to host odorants [3]; and time-of-day effects are associated with mosquito insecticide resistance [4]. An. stephensi has been demonstrated to have 24-hour nocturnal rhythms
of flight behavior that persist even in the absence of light:dark cues [5]. Finally, rhythms in
the biology of the mosquito, and indeed possibly in the human host and Plasmodium parasite,
may interact to affect disease transmission [6–8].
To date, the genomes of two strains of An. stephensi have been sequenced, one from
India and one from Pakistan (SDA-500) [9, 10]. To our knowledge, proteomics in this species is limited to an Edman degradation analysis of salivary glands [11]; mass spectrometry
proteomic analyses of salivary proteomes [11, 12], fat bodies [13, 14], and midguts/fat bodies
[14]; a mass spectrometry proteomic analysis of ageing in the head and thorax [15]; and a
recent study across multiple tissues that included genome annotation improvements
[16].
In An. gambiae, several mass spectrometry-based studies have been performed on various
tissues, including the antennae, head, body, midgut peritrophic matrix, salivary glands, and
cuticle [3, 17–20]. Proteomic experiments can be used to identify post-translational modifications, improve genome annotation, and identify and quantify proteins in a biological sample
[16, 21, 22].
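As a rough illustration of how proteomic data can feed back into genome annotation, the sketch below flags peptides that are absent from a target species' predicted proteome but present in the proteomes of other species. It is a toy under stated assumptions, not the pipeline used in this study: the function names, the exact-substring matching, the isoleucine/leucine collapsing (the two residues have identical mass), and all sequences and identifiers are hypothetical.

```python
from typing import Dict, List, Tuple

# A proteome is modelled as: protein ID -> amino acid sequence (assumption).
Proteome = Dict[str, str]


def normalize(seq: str) -> str:
    """Upper-case a sequence and collapse I into L (the residues are isobaric)."""
    return seq.upper().replace("I", "L")


def proteins_containing(peptide: str, proteome: Proteome) -> List[str]:
    """Return IDs of proteins whose sequence contains the peptide exactly."""
    pep = normalize(peptide)
    return [pid for pid, seq in proteome.items() if pep in normalize(seq)]


def find_missing_peptide_matches(
    peptides: List[str],
    target: Proteome,
    references: Dict[str, Proteome],
) -> Dict[str, List[Tuple[str, str]]]:
    """Map each peptide absent from the target proteome to its
    (species, protein ID) hits in the reference proteomes."""
    matches: Dict[str, List[Tuple[str, str]]] = {}
    for peptide in peptides:
        if proteins_containing(peptide, target):
            continue  # already explained by the current annotation
        hits = [
            (species, pid)
            for species, proteome in references.items()
            for pid in proteins_containing(peptide, proteome)
        ]
        if hits:
            matches[peptide] = hits
    return matches


# Hypothetical toy data, for illustration only.
target = {"target_toy1": "MKTAYIAKQRQISFVK"}
references = {
    "species_A": {"A_toy1": "MSHFSRQLEERLGLIEVQAPILSR"},
    "species_B": {"B_toy1": "MKTAYIAKQR"},
}
peptides = ["TAYIAK", "LEERLGLIEVQ", "WWWWWW"]

result = find_missing_peptide_matches(peptides, target, references)
```

Here only the second peptide is reported: the first is already present in the target proteome, and the third matches nothing. In practice, a hit of this kind is a lead for manual inspection of the corresponding gene model rather than proof of an annotation error.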
A previous study in An. gambiae mosquito antennae used targeted quantitative proteomics, in which the mass spectrometer was tuned to specifically identify and quantify the abundance of proteins from an a priori list of genes of interest [3] where only targeted
proteins are in (...truncated)