MaizeGDB becomes ‘sequence-centric’
0 USDA-ARS Plant Genetics Research Unit and Division of Plant Sciences, University of Missouri, Columbia, MO 65211
1 Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011
2 Present address: Michael E. Sparks, USDA-ARS Bovine Functional Genomics Laboratory, Beltsville, MD, USA
3 USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA 50011
4 Department of Statistics, Iowa State University, Ames, IA 50011, USA
5 Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720
6 USDA-ARS Plant Gene Expression Center, Albany, CA 94710
MaizeGDB is the maize research community's central repository for genetic and genomic information about the crop plant and research model Zea mays ssp. mays. The MaizeGDB team endeavors to meet research needs as they evolve based on researcher feedback and guidance. Recent work has focused on better integrating existing data with sequence information as it becomes available for the B73, Mo17 and Palomero Toluqueño genomes. Major endeavors along these lines include the implementation of a genome browser to graphically represent genome sequences; implementation of POPcorn, a portal ancillary to MaizeGDB that offers access to independent maize projects and will allow BLAST similarity searches of participating projects' data sets from a single point; and a joint MaizeGDB/PlantGDB project to involve the maize community in genome annotation. In addition to summarizing recent achievements and future plans, this article also discusses specific examples of community involvement in setting priorities and design aspects of MaizeGDB, which should be of interest to other database and resource providers seeking to better engage their users. MaizeGDB is accessible online at http://www.maizegdb.org. Database URL: http://www.maizegdb.org
Introduction
Maize is one of very few species that serve both as an
important research model and as a crop from which
diverse products and resources are generated [reviewed
in (1, 2)]. This breadth of scope is recapitulated by the
wide variety of informatics needs expressed by the
community of maize biologists: not only are tools for handling
genetic and genomic information needed, but support for
translational and applied research is also of great interest
[reviewed in (3)].
To better understand the broad needs of the research
community and prioritize development goals, a Working
Group (http://www.maizegdb.org/working_group.php)
made up of maize geneticists and computational biologists
meets annually to discuss the MaizeGDB project's status
and to suggest how to further develop the MaizeGDB
resource. In addition, the maize community periodically
organizes meetings to gather information on key needs
to move maize research forward. In March 2007, lab
heads met at the Allerton Park and Conference Center in
Monticello, IL, to discuss ‘The Future of Maize Genetics’
[meeting report available at
http://www.maizegdb.org/AllertonReport.doc and (4)]. The
MaizeGDB Working Group guidance and the Allerton report agree
that two needs are of the utmost priority: improving
access to the genome sequence of inbred line B73 (as well
as other maize genome sequences as they become
available) and creating tools to improve phenotype data
collection, storage and analysis. With this in mind, sequence
data and phenotypes constitute much of the current
MaizeGDB Project Plan, a document that outlines work to
be accomplished by MaizeGDB over a 5-year period
(2009–14). In brief, the goals are as follows:
(1) to integrate new maize genetic and genomic data
into the database by
  • expanding mutant and phenotype data and tools
    as well as structural and genetic map sets,
    emphasizing the integration of the IBM genetic maps
    with the B73 genome sequence;
  • creating views that convey the substantial variation
    in maize genome structure;
  • integrating the ‘next-generation’ genetic map being
    generated by the Maize Diversity Project (5) into a
    genomic view to enable its effective use by plant
    breeders;
  • providing access to gene models calculated by
    leading gene structure prediction groups through the
    MaizeGDB interface;
  • compiling and making accessible the annual Maize
    Newsletter at MaizeGDB;
and
(2) to provide community support services, such as
helping the community of maize researchers
develop and publicize a set of
guidelines for researchers to follow to ensure that their
data can be made available through MaizeGDB;
coordinating annual meetings; and conducting elections
and surveys.
MaizeGDB currently has a wide range of maize data
including genetic maps, gene products, loci, alleles, phenotypes,
stocks, sequences and markers. However, centralized access
to ongoing maize projects that create sequence-indexed
data (roughly 10–15 projects at any given time) is
notably lacking. Reported here are some recent updates to
MaizeGDB with emphasis on improving the handling and
accessibility of sequence data, especially data generated
by the Maize Genome Sequencing Project for B73 (6).
Of particular note are (i) the new MaizeGDB Genome
Browser (see ‘Genomic sequence data display and
integration with genetic maps’ section); (ii) a new project ancillary
to MaizeGDB called POPcorn, which currently serves as a
portal to maize research projects, with a centralized maize
sequence similarity search resource coming soon; and (iii) a
recently launched project to involve the community of
maize geneticists in genome annotation for B73 (outlined
in ‘Current endeavors’ section).
MaizeGDB's standard operating procedures, machine
architecture and accessibility, and how the
databases are administered, are described elsewhere (1, 2).
Data made available via MaizeGDB are in the public
domain.
Genomic sequence data display and integration with genetic maps
Genome browser
Based upon the 2006 MaizeGDB Working Group guidance
(available at the bottom of http://www.maizegdb.org/
working_group.php) and the Allerton meeting report (4),
the MaizeGDB Team began working in early 2007 to make
MaizeGDB more ‘sequence-centric’. To
this end, an initiative to implement a MaizeGDB Genome
Browser was launched in early 2008 and completed in
December 2008. The MaizeGDB Genome Browser enables
MaizeGDB to become the long-term and centralized keeper
of maize gene models (which ensures proper
nomenclature) and serves as a way to compare various groups'
assemblies and annotations simultaneously.
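To illustrate what comparing annotations from different groups can involve at the data level, the following minimal Python sketch reports which gene models from two annotation sets share bases on the same chromosome. It is not the method used by the MaizeGDB Genome Browser, which presents such comparisons graphically; the `GeneModel` type, the `overlapping_models` function and all identifiers and coordinates are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class GeneModel(NamedTuple):
    """A gene model reduced to its genomic span (hypothetical structure)."""
    model_id: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlapping_models(set_a, set_b):
    """Return (id_a, id_b, overlap_bp) for each pair of models that share bases."""
    # Index the second set by chromosome so only same-chromosome pairs are compared.
    by_chrom = defaultdict(list)
    for model in set_b:
        by_chrom[model.chrom].append(model)

    pairs = []
    for a in set_a:
        for b in by_chrom[a.chrom]:
            # Length of the shared interval; zero or negative means no overlap.
            overlap = min(a.end, b.end) - max(a.start, b.start) + 1
            if overlap > 0:
                pairs.append((a.model_id, b.model_id, overlap))
    return pairs


# Hypothetical identifiers and coordinates, not real maize gene models.
group_a = [
    GeneModel("A_gene1", "chr1", 1000, 5000),
    GeneModel("A_gene2", "chr2", 200, 900),
]
group_b = [
    GeneModel("B_gene1", "chr1", 4500, 8000),
    GeneModel("B_gene2", "chr2", 1000, 2000),
]

print(overlapping_models(group_a, group_b))
```

The pairwise comparison within each chromosome is adequate for a small illustration; genome-scale annotation sets would more likely call for sorted intervals or an interval tree.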
A variety of genome browser applications were
evaluated via a survey prepared on behalf of the Maize
Genetics Executive Committee (accessible online at
http://www.maizegdb.org/blanksurvey.html) to gauge
cooperators' impressions of existing software and to find out what
functionalities they would like to have in a maize genome
browser. A summary of the survey results is available online
at http://www.maizegdb.org/genome_browser_survey.php.
Based upon results of the survey, GBrowse (7) was
selected for th (...truncated)