T-Cell Epitope Prediction: Rescaling Can Mask Biological Variation between MHC Molecules
Asquith B (2009) T-Cell Epitope Prediction: Rescaling Can Mask Biological Variation between MHC
Molecules. PLoS Comput Biol 5(3): e1000327. doi:10.1371/journal.pcbi.1000327
Aidan MacNamara, Ulrich Kadolsky, Charles R. M. Bangham, Becca Asquith
Department of Immunology, Imperial College School of Medicine, London, United Kingdom
Editor: Rob J. De Boer, Utrecht University, The Netherlands
Theoretical methods for predicting CD8+ T-cell epitopes are an important tool in vaccine design and for enhancing our understanding of the cellular immune system. The most popular methods currently available produce binding affinity predictions across a range of MHC molecules. When results are compared between these MHC molecules, it is common practice to apply a normalization procedure known as rescaling to correct for possible discrepancies between the allelic predictors. Using two of the most popular prediction software packages, NetCTL and NetMHC, we tested the hypothesis that rescaling removes genuine biological variation from the predicted affinities when comparing predictions across a number of MHC molecules. We found that omitting rescaling improved the performance of the prediction software both qualitatively, in terms of ranking epitopes, and quantitatively, in the accuracy of the binding affinity predictions. We suggest that there is biologically significant variation among class I MHC molecules and find that retention of this variation leads to significantly more accurate epitope prediction.
Funding: This work was supported by the Wellcome Trust UK [WT078326] and RCUK. The funders had no role in study design, data collection and analysis,
decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
Cytotoxic T lymphocytes (CTLs) discriminate between healthy
and pathogen-infected cells by recognizing and responding to a
molecular complex on the surface of the infected cell. This
complex consists of a specific major histocompatibility complex
(MHC) molecule and a peptide derived from the proteins
contained in the cell. If the cell contains a pathogen, peptides
from the pathogen proteome will be presented and, with the right
MHC-peptide complex, a CTL response will be elicited.
Of the large number of peptides that can be derived from a
pathogen, only a small minority elicits a CTL response. This fraction
has been estimated to be between 1 in 2,000 and 1 in 5,600 [1,2].
This limitation in the number of peptides that are immunogenic is
conferred by three main constraints: the requirement for peptide
cleavage and transport, the requirement for MHC-peptide binding
and the requirement for CTL recognition. By far the most stringent
of these is the requirement for MHC-peptide binding, because only
1 in 40–200 peptides binds a specific MHC molecule with sufficient
affinity to elicit an immune response [1,2]. Further selection is
largely due to the limitations of peptide processing and transport. In
these processes, individual peptides are produced from the
precursor polypeptides by proteasomal cleavage of the polypeptide,
which can be followed by N-terminal trimming by other peptidases.
This is followed by the transport of the peptides from the cytosol to
the endoplasmic reticulum, mediated by the TAP complex. Further
N-terminal trimming may occur before the peptide binds to the
MHC molecule. The requirements of processing and transport
eliminate approximately 80% of potential epitopes [1]. Finally, T
cell specificity, i.e. the requirement for T cell receptor binding of the
MHC-peptide complex, further halves the number of presented
peptides that elicit a response. The probability of each of these steps
is determined by the polypeptide sequence, amongst other factors
[3].
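As a rough consistency check on the figures above, the three constraints can be treated as independent, sequential filters. This independence is an assumption made here purely for illustration, not a claim taken from the cited studies, and the symbols $p_{\text{bind}}$, $p_{\text{proc}}$ and $p_{\text{TCR}}$ are introduced only for this sketch.

```latex
\begin{align*}
P(\text{immunogenic}) &\approx p_{\text{bind}} \times p_{\text{proc}} \times p_{\text{TCR}} \\
p_{\text{proc}} &\approx 1 - 0.8 = 0.2, \qquad p_{\text{TCR}} \approx 0.5 \\
p_{\text{proc}} \times p_{\text{TCR}} &\approx 0.2 \times 0.5 = 0.1
\end{align*}
```

Under this assumption, processing and T cell recognition together pass roughly 1 in 10 of the peptides that bind, so an overall frequency of 1 in 2,000 would correspond to a binding frequency of about 1 in 200.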
Once CTLs recognize the MHC-peptide complex, they are
capable of destroying the infected cell by the release of lytic
granules containing cytotoxic effector proteins. This results in the
destruction of the target cell by apoptosis. An effective CTL
response has been shown to confer protection against viral
infections such as HIV [4] and HTLV-I [5]. Hence, the
identification of T cell epitopes is of vital importance in the
design of vaccines and understanding of the immune system
[6,7,8]. However, given the scarcity of epitopes, experimentally
screening all possible peptides for each MHC allele (e.g. by IFNγ
ELISpot) is time-consuming, expensive and inefficient. One way to
improve the efficiency of the identification process is to first use
theoretical algorithms to predict which peptides are more likely to
be epitopes and then experimentally screen this much smaller,
selected list of peptides. This method is widely used [9–12] and has
been applied in a number of studies to identify potential vaccine
targets [13,14]. The use of theoretical methods to pre-screen
peptides is of particular importance in the case of emerging
infections such as avian influenza [15] where rapid vaccine
development would be vital. This approach underpins a large
biopreparedness initiative coordinated by the Large-Scale Antibody
and T Cell Epitope Discovery Program [7], which intends to foster
development of immune-based therapeutics for emerging and
reemerging pathogens including potential bioterrorism agents.
The accuracy of these methods has also been demonstrated by the
prediction of the vast majority of CTL epitopes from the vaccinia
virus [16].
Author Summary
The use of prediction software has become an important
tool in increasing our knowledge of infectious disease. It
allows us to predict the interaction of molecules involved
in an immune response, thereby significantly shortening
the lengthy process of experimental elucidation. A high
proportion of this software has focused on the response of
the immune system against pathogenic viruses. This
approach has produced positive results towards vaccine
design, results that would be delayed or unobtainable
using a traditional experimental approach. The current
challenge in immunological prediction software is to
predict interacting molecules to a high degree of accuracy.
To this end, we have analysed the best software currently
available for predicting the interaction between a viral
peptide and the MHC class I molecule, an interaction that
is vital in the body's defence against viral infection. We
have improved the accuracy of this software by
challenging the assumption that different MHC class I molecules
will bind to the same number of viral peptides. Our
method shows a significant improvement in correctly
predicting which viral peptides bind to MHC class I
molecules.
More generally, epitope prediction algorithms are being
increasingly used to understand the CTL response. For example,
in the case of HIV-1 infection, algorithms have been used to
confirm which MHC-associated epitope mutations are likely to
confer escape (...truncated)