MetaboLights: towards a new COSMOS of metabolomics data management
Christoph Steinbeck, Pablo Conesa, Kenneth Haug, Tejasvi Mahendraker, Mark Williams, Eamonn Maguire, Philippe Rocca-Serra, Susanna-Assunta Sansone, Reza M. Salek, Julian L. Griffin
R. M. Salek, J. L. Griffin: Elsie Widdowson Laboratory, Fulbourn Road, Cambridge CB1 9NL, UK
E. Maguire, P. Rocca-Serra, S.-A. Sansone: Oxford e-Research Centre, University of Oxford, Oxford, UK
C. Steinbeck (corresponding author), P. Conesa, K. Haug, T. Mahendraker, M. Williams: European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
R. M. Salek, J. L. Griffin: Department of Biochemistry, University of Cambridge, Cambridge CB2 1QW, UK
Exciting funding initiatives are emerging in Europe and the US for metabolomics data production, storage, dissemination and analysis. They build on a rich ecosystem of resources around the world, which has been built during the past ten years, including but not limited to MassBank in Japan and the Human Metabolome Database in Canada. Now, the European Bioinformatics Institute has launched MetaboLights, a database for metabolomics experiments and the associated metadata (http://www.ebi.ac.uk/metabolights). It is the first comprehensive, cross-species, cross-platform metabolomics database maintained by one of the major open access data providers in molecular biology. In October, the European COSMOS consortium will start its work on metabolomics data standardization, publication and dissemination workflows. The NIH in the US is establishing 6-8 metabolomics service cores as well as a national metabolomics repository. This communication reports on MetaboLights as a new resource for metabolomics research, summarises the related developments and outlines how they may consolidate knowledge management in this third large omics field next to proteomics and genomics.
1 Introduction
Metabolomics has become an important phenotyping
technique for molecular biology and medicine. It assesses
the molecular state of an organism or collections of
organisms through the comprehensive quantitative and
qualitative analysis of all small molecules in cells, tissues,
and body fluids. Metabolic processes are at the core of
physiology. Consequently, metabolomics is ideally suited
as a medical tool for characterizing disease states in
organisms and as a tool for assessing the suitability of
organisms for, for example, renewable energy production or
biotechnological applications in general. In addition, the
application of metabolomics in environmental science,
toxicology, and the food and medical industries is well established,
growing and documented. Metabolomics studies generate
large amounts of analytical data (gigabytes to terabytes,
depending on the size of the study) and therefore pose
significant challenges for biomedical and life science
e-infrastructures, which must cope with such data volumes and
ensure that the data are captured, stored and disseminated
based on open and widely accepted community standards.
Years after the first standardisation exercises (Fiehn et al.
2007; Taylor et al. 2008), metabolomics is now reaching
the state of a mature analytical technique, as indicated by
the establishment of 6-8 Regional Comprehensive
Metabolomics Resource Cores (RCMRCs) by the NIH in
the United States
(http://grants.nih.gov/grants/guide/rfa-files/RFA-RM-11-016.html). In addition, we are now
facing a rich ecosystem of specialised metabolomics
databases (Wishart et al. 2007; Kopka et al. 2005;
Smith et al. 2005; Skogerson et al. 2011) as well as the first
general metabolomics repositories and databases
(http://www.ebi.ac.uk/metabolights), which are now emerging. In Europe, the
COSMOS consortium of 14 leading laboratories in
metabolomics will begin its work on standards, data
management and dissemination in metabolomics. Here, we outline
these developments and show how they may consolidate
knowledge management in this third large omics field
next to proteomics and genomics.
2 MetaboLights: a cross-species repository
for metabolomics experiments
The European Bioinformatics Institute (EMBL-EBI) has
recently launched MetaboLights, a database for
metabolomics experiments and the associated metadata. It aims
to become the first comprehensive, cross-species,
cross-platform metabolomics database maintained by one of the
major open access data providers in molecular biology. The
EBI ensures long-term stability and maintenance of the
resource. Deposited datasets are assigned a stable identifier
of the form MTBLS1 (the first dataset ever deposited in
MetaboLights). These identifiers, like other stable
identifiers in bioinformatics, can be used to reference datasets in
publications or to merge data in systems biology applications.
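To illustrate how such identifiers could be handled in software, the following is a minimal Python sketch. It assumes that every identifier consists of the prefix MTBLS followed by a positive integer; the text only shows MTBLS1, so the general pattern is an assumption. The function names parse_mtbls_id and find_mtbls_ids are hypothetical and are not part of any MetaboLights interface.

```python
import re
from typing import List

# Assumption: identifiers are "MTBLS" followed by a positive integer (e.g. MTBLS1).
_MTBLS_FULL = re.compile(r"MTBLS([1-9][0-9]*)")
_MTBLS_IN_TEXT = re.compile(r"\bMTBLS[1-9][0-9]*\b")


def parse_mtbls_id(text: str) -> int:
    """Return the numeric part of an identifier such as 'MTBLS1'.

    Raises ValueError if the string does not match the assumed pattern.
    """
    match = _MTBLS_FULL.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not an MTBLS-style identifier: {text!r}")
    return int(match.group(1))


def find_mtbls_ids(document: str) -> List[str]:
    """Collect identifiers mentioned in free text, in order of first appearance."""
    found: List[str] = []
    for match in _MTBLS_IN_TEXT.finditer(document):
        identifier = match.group(0)
        if identifier not in found:  # skip repeated mentions
            found.append(identifier)
    return found
```

A helper like find_mtbls_ids could, for example, be run over the text of a manuscript to list the datasets it references, and parse_mtbls_id could be used to check user input before querying the repository.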
Fig. 2 MetaboLights data submission workflow
Fig. 1 MetaboLights general outline with repository and reference
layer. The reference layer is currently work in progress
Like all other EBI resources (...truncated)