JenPep: a database of quantitative functional peptide data for immunology



Martin J. Blythe, Irini A. Doytchinova, Darren R. Flower

Edward Jenner Institute for Vaccine Research, Compton, Berkshire RG0 7NN, UK

Motivation: The compilation of quantitative binding data underlies attempts to derive tools for the accurate prediction of epitopes in cellular immunology and is part of our concerted goal to develop practical computational vaccinology.

Results: JenPep is a family of relational databases supporting the growing community of immunoinformaticians. It contains quantitative data on peptide binding to Major Histocompatibility Complexes (MHCs) and to Transmembrane Peptide Transporter (TAP), as well as an annotated list of T-cell epitopes.

Availability: The database is available via the Internet. An HTML interface allowing searching of the database can be found at the following address: http://www.jenner.ac.uk/JenPep.

INTRODUCTION

As the field of Bioinformatics has grown and matured into a new branch of science, new sub-disciplines have emerged within it. Immunoinformatics, the application of informatics and modelling techniques to molecules of the immune system, is one of the most exciting of these newly emergent sub-disciplines. One of the principal goals of immunoinformatics is to develop computer aided vaccine design, or computational vaccinology, and apply it to the quest for new vaccines. At the heart of computational vaccinology is the problem of epitope prediction.

The focus of our present work is the development of a new database system in cellular, or T-cell, immunology. A specialized type of immune cell mediates cellular immunity: the T-cell. These cells constantly patrol the body hunting for foreign proteins originating from pathogenic organisms such as viruses or bacteria. T-cells express a particular kind of receptor: the T-Cell Receptor (TCR), which exhibits a wide range of selectivities and affinities. TCRs bind to Major Histocompatibility Complex (MHC) proteins presented on the surfaces of other cells.
These proteins bind small peptide fragments, or epitopes, derived from both host and pathogen proteins. It is recognition of such complexes that lies at the heart of both the adaptive and memory cellular immune responses. The overall process leading to the cell-surface presentation of epitopes derived from antigenic protein is complex and not yet fully understood.

There are two main antigen presentation pathways: classes I and II. Class I MHCs are expressed by most nucleated cells, albeit with some exceptions. T-cells whose surfaces are rich in CD8 co-receptor protein recognize class I MHCs. Class II MHCs are expressed only on so-called professional antigen presenting cells and are recognized by T-cells whose surfaces are rich in CD4 co-receptors.

Class I peptides are typically, but not exclusively, derived from intracellular proteins, such as those of viruses. These proteins are targeted to the proteasome, which cleaves them into short peptides of 8 to 11 amino acids in length. These peptides are bound by the Transmembrane Peptide Transporter (TAP), which translocates them from the cell cytoplasm into the Endoplasmic Reticulum (ER), where they are in turn bound by MHC protein. For class II, extracellular protein derived from a pathogen is taken up by receptor-mediated ingestion and targeted to an endosomal compartment, where it is cleaved by cathepsins to produce peptides of 15 to 20 amino acids. Class II MHCs then bind these peptides. Peptide-bound MHCs are presented on the surface of the cell, where they are recognized, as T-cell epitopes, by T-cells.

MHC proteins are polymorphic, each exhibiting slightly different peptide selectivities. The combination of MHC and TCR selectivities determines the power and scope of peptide recognition in the immune system and thus the recognition of foreign and self-antigenic peptides.
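The peptide lengths quoted above (8 to 11 residues for class I, 15 to 20 for class II) define the search space that any epitope prediction method has to consider. As a minimal sketch, assuming a plain sliding window over a protein sequence, the candidate peptides can be enumerated as below. The function name `candidate_peptides` and the example sequence are hypothetical and chosen for illustration; the code does not model proteasomal or cathepsin cleavage specificity, it only lists every contiguous fragment in the given length range.

```python
from typing import Iterator, Tuple


def candidate_peptides(
    protein: str, min_len: int = 8, max_len: int = 11
) -> Iterator[Tuple[int, str]]:
    """Yield (start, peptide) for every contiguous fragment of the protein.

    Defaults follow the class I length range quoted in the text (8 to 11);
    pass min_len=15, max_len=20 for the class II range.
    This is an illustrative sliding window, not a cleavage model.
    """
    if min_len < 1 or max_len < min_len:
        raise ValueError("require 1 <= min_len <= max_len")
    for length in range(min_len, max_len + 1):
        # range() is empty when the protein is shorter than `length`
        for start in range(len(protein) - length + 1):
            yield start, protein[start:start + length]


# Hypothetical 12-residue sequence, used only to exercise the function.
example = "ACDEFGHIKLMN"
class_i_candidates = list(candidate_peptides(example))
```

For a protein of n residues the window yields n - L + 1 fragments for each length L, so the number of candidates grows roughly linearly with protein length for a fixed length range.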
Experimental work has established that only peptides that bind with high affinity to MHC molecules are recognized as T-cell epitopes by TCRs (Sette et al., 1994a,b). Weaker or non-binding peptides are simply not recognized. Expressed in terms of a competition assay, the IC50 must be less than 500 nM. IC50 values are binding affinities measured using a radioisotope-labeled reference peptide. Prediction of MHC binding is thus a pre-requisite to the prediction of T-cell epitopes.

Most attempts to predict binding peptides have sought to simplify the task by using a classification scheme, dividing peptides into non-binders, low affinity binders, medium affinity binders, or high affinity binders. Again, in terms of IC50 values: non-binders show no affinity, low binders have IC50 > 500 nM, medium binders have IC50 between 50 nM and 500 nM, and high binders have IC50 < 50 nM. However, more recent work has turned to the development of fully quantitative models (Rognan et al., 1999; Doytchinova and Flower, 2001, 2002). To achieve this we must have access to a database of allele-specific quantitative binding data. It is only from data of this type that we can build statistically accurate models for the prediction of binding.

To model the process accurately we need to focus on well characterized data for the binding of peptides to TAP and to MHCs, and their subsequent functioning as T-cell epitopes. Certain groups have access to some of these data, but currently there is no publicly available database or compilation. As part of our attempts to develop computational vaccinology, we have set about constructing such a database, which we have called JenPep. The following paper describes version 1.0 of this database.
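The four-way classification scheme described above, with IC50 cut-offs at 50 nM and 500 nM, can be sketched as a small function. This is an illustrative reading of the thresholds in the text, not code from JenPep: the name `classify_binder` is hypothetical, a missing measurement (`None`) is assumed to stand for "no measurable affinity", and the treatment of values exactly equal to 50 nM or 500 nM (both assigned to the medium class here) is an assumption, since the text leaves the boundaries open.

```python
from typing import Optional

HIGH_CUTOFF_NM = 50.0   # high binders: IC50 below this value
LOW_CUTOFF_NM = 500.0   # low binders: IC50 above this value


def classify_binder(ic50_nm: Optional[float]) -> str:
    """Map an IC50 value in nM to the affinity classes named in the text.

    Assumptions (not stated in the source): None means no measurable
    binding, and the boundary values 50 nM and 500 nM count as medium.
    """
    if ic50_nm is None:
        return "non-binder"
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    if ic50_nm < HIGH_CUTOFF_NM:
        return "high"
    if ic50_nm <= LOW_CUTOFF_NM:
        return "medium"
    return "low"


# Hypothetical IC50 values in nM, used only to exercise the function.
labels = [classify_binder(v) for v in (5.0, 120.0, 2500.0, None)]
```

Binning of this kind discards the magnitude of the affinity within each class, which is why the fully quantitative models cited above need the measured values themselves rather than the class labels.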
SYSTEMS AND METHODS

Database size and structure

Version 1.0 of JenPep is composed of three component sub-databases: a compilation of quantitative measures of binding for peptides to classes I and II MHCs; a compendium of dominant and subdominant T-cell epitopes; and a similar set of quantitative data for peptide binding to the TAP peptide transporter. This compilation was derived through exhaustive, semi-manual searching of the primary literature. We searched the available literature databases extensively, using keyword and author searches, retrospective searching, and citation matching of key authors (particularly those describing the development of an assay system), to identify new papers reporting experimentally measured quantitative values. The database is organized on the basis of peptides, which are defined by their sequence and length. A schematic of the database structure is included in Figure 1.

Peptide origin. Information on the origin of the peptide is taken from the reference paper and, failing that, from results obtained using BLAST (Altschul et al., 1997). A hypertext link is made to the corresponding SWISS-PROT entry. The reference sequence is taken from that most closely matching the peptide as published.

Restriction allele. Information on the MHC restriction allele is given for all entries except those in the TAP database. MHC nomenclature has been standardized to the best of our a (...truncated)