Jenner-predict server: prediction of protein vaccine candidates (PVCs) in bacteria based on host-pathogen interactions
BMC Bioinformatics
Varun Jaiswal 0
Sree Krishna Chanumolu 0
Ankit Gupta 0
Rajinder S Chauhan 0
Chittaranjan Rout 0
0 Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh 173234, India
Background: Subunit vaccines based on recombinant proteins have been effective in preventing infectious diseases and are expected to meet the demands of future vaccine development. Computational approaches, especially the reverse vaccinology (RV) method, have enormous potential for identification of protein vaccine candidates (PVCs) from a proteome. The existing protective antigen prediction software and web servers have low prediction accuracy, which limits their application in vaccine development. Besides machine learning techniques, those software and web servers have considered only a protein's adhesin-likeliness as the criterion for identification of PVCs. Several non-adhesin functional classes of proteins involved in host-pathogen interactions and pathogenesis are known to provide protection against bacterial infections. Therefore, knowledge of bacterial pathogenesis has the potential to identify PVCs. Results: A web server, Jenner-Predict, has been developed for prediction of PVCs from proteomes of bacterial pathogens. The web server targets host-pathogen interactions and pathogenesis by considering known functional domains from protein classes such as adhesin, virulence, invasin, porin, flagellin, colonization, toxin, choline-binding, penicillin-binding, transferrin-binding, fibronectin-binding and solute-binding. It predicts non-cytosolic proteins containing the above domains as PVCs. It also reports the vaccine potential of PVCs in terms of their possible immunogenicity (by comparison with experimentally known IEDB epitopes), absence of autoimmunity and conservation across different strains. Predicted PVCs are prioritized so that only a few prospective PVCs need to be validated experimentally. The performance of the web server was evaluated against known protective antigens from diverse classes of bacteria reported in the Protegen database and the datasets used for VaxiJen server development.
The web server efficiently predicted known vaccine candidates reported from Streptococcus pneumoniae and Escherichia coli proteomes. The Jenner-Predict server outperformed the NERVE, Vaxign and VaxiJen methods. It has a sensitivity of 0.774 and 0.711 for the Protegen and VaxiJen datasets, respectively, and a specificity of 0.940 for the latter dataset. Conclusions: The better prediction accuracy of the Jenner-Predict web server indicates that domains involved in host-pathogen interactions and pathogenesis are better criteria for prediction of PVCs. The web server has successfully predicted most of the known PVCs belonging to different functional classes. The Jenner-Predict server is freely accessible at http://117.211.115.67/vaccine/home.html
Protein vaccine candidates (PVCs); Host-pathogen interactions; Domain; Antigen; Reverse vaccinology; Virulence
Background
In silico prediction has proved to be of great
significance across various disciplines of the life sciences, including
biomedical research [1]. The conventional vaccine
development methods are time consuming as they require
cultivation of pathogenic microorganisms in laboratory
conditions and their dissection using microbiological,
biochemical and immunological methods in order to identify
the components important for immunogenicity. These
methods are ineffective in circumstances where the
cultivation of bacteria is difficult or impossible. Further
limitations arise when the expression of protective antigens is
reduced or absent under in vitro conditions compared with in vivo
disease conditions [2]. In comparison to conventional
live attenuated vaccines, subunit vaccines are more reliable
as far as safety is concerned [3]. Vaccine candidate
identification is an essential component of
subunit vaccine development. The integration of genomics
in vaccine research (vaccinogenomics) is expected to
revolutionize novel vaccine candidate identification [4].
Computational approaches, especially the reverse vaccinology
(RV) method, assist the identification of vaccine
candidates from genomes without culturing microorganisms
and thus facilitate subunit vaccine development.
These methods are useful in reducing the time, cost and
number of wet lab experiments [2].
RV is a computational pipeline for identification of
vaccine candidates against microorganisms from their
genome sequences. Thus, all proteins of an organism can
be screened computationally for their vaccine potential.
Significant success of this principle for vaccine
development has already been demonstrated in several pathogens,
including Neisseria meningitidis [5], Helicobacter pylori [6],
Streptococcus pneumoniae [7], Porphyromonas gingivalis
[8], Chlamydia pneumoniae [9] and Bacillus anthracis [10].
The relevance of this method was recognized when
vaccines developed from capsular polysaccharides of N.
meningitidis B failed due to cross-reactivity against human
tissue [5]. Application of RV techniques for PVC
identification, followed by in vivo testing, led to the development of
a licensed broad-specificity protein vaccine, 5CVMB, against
N. meningitidis. This vaccine contains 5 protein antigen
components, GNA2132, GNA1870, GNA1030, GNA2091
and NadA, which were primarily discovered by RV
methods [11]. However, in earlier RV techniques, protein
localization (secretory, outer-membrane, transporter or
others) was used as the main criterion for identification
of PVCs. As a result, a large number of proteins had to
be expressed, purified and tested to obtain
a few vaccine candidates, at enormous cost in money
and time.
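The effect of screening on localization alone can be illustrated with a small sketch. It contrasts a localization-only filter with the rule stated in the abstract (non-cytosolic proteins that also carry a domain from a host-pathogen interaction class). The `Protein` record, the localization labels, the function names and the toy proteome are all hypothetical and serve only as illustration; they are not the server's actual data model or implementation.

```python
from dataclasses import dataclass

# Functional classes listed in the abstract. The string labels themselves
# are an assumption made for this sketch.
HOST_PATHOGEN_CLASSES = frozenset({
    "adhesin", "virulence", "invasin", "porin", "flagellin", "colonization",
    "toxin", "choline-binding", "penicillin-binding", "transferrin-binding",
    "fibronectin-binding", "solute-binding",
})


@dataclass(frozen=True)
class Protein:
    """Hypothetical per-protein annotation record."""
    protein_id: str
    localization: str                      # e.g. "cytosolic", "secreted"
    domain_classes: frozenset = frozenset()


def localization_only(proteins):
    """Earlier RV style: keep every protein that is not cytosolic."""
    return [p for p in proteins if p.localization != "cytosolic"]


def localization_and_domain(proteins):
    """Keep non-cytosolic proteins that carry at least one listed domain class."""
    return [
        p for p in localization_only(proteins)
        if p.domain_classes & HOST_PATHOGEN_CLASSES
    ]


# Toy proteome with made-up annotations.
toy_proteome = [
    Protein("P1", "outer-membrane", frozenset({"porin"})),
    Protein("P2", "secreted", frozenset({"toxin"})),
    Protein("P3", "outer-membrane"),
    Protein("P4", "secreted"),
    Protein("P5", "cytosolic", frozenset({"adhesin"})),
    Protein("P6", "cytosolic"),
]

by_localization = localization_only(toy_proteome)
by_domain = localization_and_domain(toy_proteome)

print(len(by_localization), "candidates by localization alone")
print(len(by_domain), "candidates after the domain filter")
```

In this toy case the localization filter keeps four of six proteins, while the added domain requirement keeps two, which mirrors why localization alone leaves many proteins to be expressed and tested.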
On the other hand, identifying immunogenic proteins
(PVCs) by using epitope prediction software and web
servers has several limitations. Comparative studies have
shown that prediction methods for B-cell epitopes (BCEs) and class-II
MHC-binding T-cell epitopes (TCEs) are not
accurate [12-15]. Over-prediction, inability to predict the exact
position of epitopes and failure to
identify known epitopes in proteins are major concerns
in vaccine candidate identification. Until now, the available
PVC prediction software and web servers have not been
very effective for identification of vaccine candidates
from genomes for vaccine design. The VaxiJen server, based on
discriminant analysis and partial least squares (DA-PLS)
methods, was developed by using datasets of known
protective antigenic (positive) and non-antigenic (negative)
proteins to predict PVCs [16]. Surprisingly, it predicts
more than half of the proteins from a given bacterial
proteome as protective antigens with default parameters,
making its use almost impractical. Further, existing software
and web servers predict different proteins as vaccine
candid (...truncated)