Enteropathogen Resource Integration Center (ERIC): bioinformatics support for research on biodefense-relevant enterobacteria
Jeremy D. Glasner (2, 3)
Guy Plunkett III (1, 2)
Bradley D. Anderson (2, 3)
David J. Baumler (2, 3)
Bryan S. Biehl (2, 3)
Valerie Burland (1, 2, 3)
Eric L. Cabot (2, 3)
Aaron E. Darling (0, 2)
Bob Mau (2, 3)
Eric C. Neeno-Eckwall (2, 3)
David Pot (2, 5)
Yu Qiu (2, 4)
Anna I. Rissman (2, 3)
Sara Worzella (2, 3)
Sam Zaremba (2, 5)
Joel Fedorko (2, 5)
Tom Hampton (2, 5)
Paul Liss (2, 3)
Michael Rusch (2, 3)
Matthew Shaker (2, 5)
Lorie Shaull (2, 5)
Panna Shetty (2, 5)
Silpa Thotakura (2, 5)
Jon Whitmore (2, 5)
Frederick R. Blattner (1, 2, 3)
John M. Greene (2, 5)
Nicole T. Perna (1, 2, 3)
0 University of Queensland, Institute for Molecular Bioscience, St Lucia Q 4072, Australia
1 Laboratory of Genetics, University of Wisconsin, 425G Henry Mall, Madison, WI 53706, USA
2
3 Genome Center, University of Wisconsin, 425G Henry Mall, Madison, WI 53703
4 University of California, San Diego, Bioengineering, 9500 Gilman Drive, La Jolla, CA 92093, USA
5 SRA International, Inc., 11300 Rockville Pike, Suite 501, Rockville, MD 20852
ERIC, the Enteropathogen Resource Integration Center (www.ericbrc.org), is a new web portal serving as a rich source of information about enterobacteria on the NIAID established list of Select Agents related to biodefense: diarrheagenic Escherichia coli, Shigella spp., Salmonella spp., Yersinia enterocolitica and Yersinia pestis. More than 30 genomes have been completely sequenced, many more exist in draft form and additional projects are underway. These organisms are increasingly the focus of studies using high-throughput experimental technologies and computational approaches. This wealth of data provides unprecedented opportunities for understanding the workings of basic biological systems and discovery of novel targets for development of vaccines, diagnostics and therapeutics. ERIC brings information together from disparate sources and supports data comparison across different organisms, analysis of varying data types and visualization of analyses in human and computer-readable formats.
The family Enterobacteriaceae includes a variety of
pathogens that pose significant threats to human health
directly, and indirectly through agricultural crops and
livestock. The Enteropathogen Resource Integration Center
(ERIC, www.ericbrc.org) is one of the eight Bioinformatics
Resource Centers (BRC) for Biodefense and Emerging/Re-Emerging
Infectious Diseases (http://www.brc-central.org/). Funded by
the National Institute of Allergy and Infectious Diseases
(NIAID), ERIC serves as an information resource for
enterobacteria on the NIAID established list of select agents
related to biodefense: diarrheagenic Escherichia coli,
Shigella spp., Salmonella spp., Yersinia enterocolitica and
Yersinia pestis. ERIC seeks to support basic research on
pathogenesis and development of novel vaccines, therapeutics
and diagnostics for these organisms by:
THE ERICBRC PORTAL OFFERS INTEGRATED ACCESS TO ALL TOOLS AND ANALYSES
The ERICBRC is a web portal that provides a single
point of access to information about the focus organisms.
The web portal, implemented with the JBoss Application
Server 4.0.5GA and JBoss Portal Server 2.4.1, provides
a single, standardized method of accessing the diverse
resources integrated into the system. In addition to
the specific resources described in the sections below, the
portal provides general information about pathogenic
enterobacteria, summaries of the genome database
contents, and links to other relevant databases, such as the
Immune Epitope Database (1), a curated set of epitopes for
the Category A-C select agents. New functionalities and
data sets are added to existing sections of the portal
when appropriate or incorporated into new portlets within
the main ERIC portal. This architecture permits rapid
deployment of new components and customizable display
of contents.
ERICASAP GENOME ANNOTATIONS
ERIC provides access to continuously updated genome
annotations for all ERIC pathogens, as well as
information from a variety of other enterobacteria useful for
reference and comparison, including E. coli K-12
(Table 1). ERIC uses the ASAP genome annotation
database system (2), backed by an Oracle 10g database, for genome
annotation and curation. ERICASAP permits database
updates continuously, obviating the need for periodic
database releases that are a common feature of many
genome databases. There are three general types of user
accounts available for genome annotation purposes.
Administrator accounts permit users the full range of
capabilities including the ability to create new genome
projects in the system. Curator accounts give users the
ability to update ERIC annotations using sophisticated
web-based interfaces for manual annotation and curation
of information as well as tools for uploads of large sets
of annotation data. Annotator accounts provide users
with interfaces for manual annotation of individual
annotation records. The annotation interfaces are all
web-based and can be accessed by any member of the research
community who requests an account. The availability
of three different types of user accounts is designed to
meet the needs of different types of annotators and
to encourage training in use of the annotation tools that
can be used to update large numbers of annotation
records at a time. Genomes in ERIC can be either public
or private projects, with users assigned to any of the
three types of user accounts. All public genome sequence
data and annotations, including any newly added
information, are accessible without an account.
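The three account types and the public/private project distinction can be pictured as a small permission model. The following is only an illustrative sketch, not the ASAP implementation: the action names, the `Project` class, and the assumption that the roles form a strict hierarchy (Administrator above Curator above Annotator) are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    # Assumed ordering: each role includes the capabilities of those below it.
    ANNOTATOR = 1      # manual annotation of individual records
    CURATOR = 2        # adds uploads of large sets of annotation data
    ADMINISTRATOR = 3  # adds creation of new genome projects


# Hypothetical action names mapped to the minimum role allowed to perform them.
MIN_ROLE = {
    "edit_record": Role.ANNOTATOR,
    "bulk_upload": Role.CURATOR,
    "create_project": Role.ADMINISTRATOR,
}


@dataclass
class Project:
    name: str
    public: bool
    members: dict = field(default_factory=dict)  # user name -> Role


def can_view(project, user=None):
    """Public projects are readable without an account; private ones need membership."""
    return project.public or user in project.members


def can_perform(project, user, action):
    """A project member may perform an action if their role meets the minimum for it."""
    role = project.members.get(user)
    return role is not None and role.value >= MIN_ROLE[action].value
```

For example, with `Project("demo", public=False, members={"alice": Role.CURATOR})`, `can_perform(project, "alice", "bulk_upload")` is true, while `can_view(project)` without a user is false because the project is private.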
Our goal is to provide genome annotations that are
accurate, detailed, up-to-date and consistent across
genomes. Descriptions of the standard operating
procedures (SOPs) used by the ERIC curators are available for
download from the portal
(http://www.ericbrc.org/portal/eric/aboutasap). Every annotation record includes a
description of the evidence supporting the data, and this
is the primary way we assess the quality of the
annotation information and measure improvements over time.
Explanations of the evidence codes and how they are used
can be found in the SOP describing gene annotation
(http://www.ericbrc.org/portal/eric/sopCdsAnnotation).
ERICASAP is open for contribution by the research
community to encourage annotation by domain experts.
An additional layer of quality control is provided by a
curation status tag for each annotation that indicates
whether the information has been independently approved
by one of a select group of trusted users and dedicated
curators.
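One way to picture how per-record evidence and curation status could be used to track annotation quality is a short sketch like the one below. The record fields and the evidence labels are hypothetical placeholders, not the actual ERIC schema or evidence codes, which are defined in the SOPs cited above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    feature_id: str
    value: str
    evidence: str   # hypothetical evidence label, not an actual ERIC code
    approved: bool  # curation status: independently approved or not


def evidence_summary(records):
    """Count annotation records per evidence label."""
    return Counter(r.evidence for r in records)


def approved_fraction(records):
    """Fraction of records whose curation status is approved (0.0 if empty)."""
    records = list(records)
    if not records:
        return 0.0
    return sum(r.approved for r in records) / len(records)
```

Computing these two summaries at different points in time would give a simple measure of how the evidence base and the share of approved annotations change, assuming each record carries both fields.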
Sequences and annotations in ERIC can be downloaded
in a variety of formats including GenBank flatfile format
and GFF3. Files downloaded directly from ERIC reflect
continuous updates by the dedicated curatorial staff as
well as community-contributed annotations. Snapshots
of sequences annotated de novo by ERIC are also
deposited in GenBank. Examples include the genome of
Y. pestis strain CA88-4125 (GenBank accession number
ABCD00000000) and plasmid pMAR7 from
enteropathogenic Escherichia coli (3). ERIC is working (...truncated)