A minimal metadata set (MNMS) to repurpose nonclinical in vivo data for biomedical research
lab animal
Perspective
https://doi.org/10.1038/s41684-024-01335-0
Anastasios Moresis1,10, Leonardo Restivo2,10, Sophie Bromilow3, Gunnar Flik4, Giorgio Rosati5,
Fabrizio Scorrano6, Michael Tsoory7, Eoin C. O’Connor8, Stefano Gaburro5 &
Alexandra Bannach-Brown9
Although biomedical research is experiencing a data explosion, the accumulation of vast
quantities of data alone does not guarantee a primary objective for science: building upon
existing knowledge. Data collected that lack appropriate metadata cannot be fully interrogated
or integrated into new research projects, leading to wasted resources and missed opportunities
for data repurposing. This issue is particularly acute for research using animals, where concerns
regarding data reproducibility and ensuring animal welfare are paramount. Here, to address this
problem, we propose a minimal metadata set (MNMS) designed to enable the repurposing
of in vivo data. MNMS aligns with an existing validated guideline for reporting in vivo data
(ARRIVE 2.0) and contributes to making in vivo data FAIR-compliant. Scenarios where MNMS
should be implemented in diverse research environments are presented, highlighting opportunities
and challenges for data repurposing at different scales. We conclude with a ‘call for action’
to key stakeholders in biomedical research to adopt and apply MNMS to accelerate both the
advancement of knowledge and the betterment of animal welfare.
Biomedical research is experiencing a data explosion, fueled by recent
technological advancements that have accelerated data production capabilities. Data-rich multiomics approaches and high-resolution functional
measures, such as multimodal imaging or recordings of physiology and
behavior, are routinely being employed across the entire lifespan of model
organisms in both health and disease states.
On the one hand, this new era presents a great opportunity to
accelerate scientific understanding. On the other hand, the mere collection of vast amounts of data is not sufficient to ensure scientific
progress if these data cannot be interrogated and reintegrated into the
research cycle. One consequence of limited data sharing and poor transparency might be the need for repeated replication of prior findings,
frequently without success1–3. These common practices result in a substantial waste of resources and missed opportunities for data repurposing.
This topic is especially pertinent to research involving animals. Failures
to replicate findings and missed opportunities for data repurposing
undoubtedly lead to animal use that provides little or no new scientific
progress and is cause for ethical concern. Thus, there is an urgent need
to encourage and facilitate repurposing of nonclinical in vivo data in
biomedical research.
In Europe and North America, legislation for animal experimentation in biomedical research focuses heavily on implementation of the 3Rs
(see definition in Box 1), which encompasses the concepts of replacement, reduction and refinement4. The objective of the 3Rs is to ensure
that animal experimentation achieves the highest level of welfare while
minimizing burden through well-designed and reviewed animal research
protocols and procedures. Yet, despite this robust regulatory framework,
it is becoming increasingly clear that regulatory guidance protecting
1Roche Pharma Research and Early Development, Data & Analytics, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland.
2Neuro-Behavioral Analysis Unit, Faculty of Biology & Medicine, University of Lausanne, Lausanne, Switzerland. 3Group Legal Department,
F. Hoffmann-La Roche Ltd, Basel, Switzerland. 4Discovery, Charles River Laboratories, Groningen, the Netherlands. 5Tecniplast S.p.A., Buguggiate,
Italy. 6Emerging Technologies, Comparative Medicine, Novartis International AG, Basel, Switzerland. 7Behavioral and Physiological Phenotyping Unit,
Department of Veterinary Resources, Weizmann Institute of Science, Rehovot, Israel. 8Roche Pharma Research and Early Development, Neuroscience &
Rare Diseases, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, Switzerland. 9QUEST Center for Responsible Research, Berlin Institute
of Health at Charité–Universitätsmedizin Berlin, Berlin, Germany. 10These authors contributed equally: Anastasios Moresis, Leonardo Restivo.
Lab Animal | Volume 53 | March 2024 | 67–79
Box 1 | Definitions of key terms
API: an acronym that stands for application programming interface.
An API is a set of protocols for communication and automated data
transfer between two computer applications.
Ontology: an ontology is a system of carefully defined terminology,
connected by logical relationships and designed for both humans
and computers to use.
ARRIVE: the ARRIVE guidelines (Animal Research: Reporting of
In Vivo Experiments) were originally developed in 2010 to
improve the reporting of animal research. They consist of a
checklist of information to include in publications describing in vivo
experiments to enable others to scrutinize the work adequately,
evaluate its methodological rigor and reproduce the methods
and results18.
3Rs: an acronym that stands for replacement, reduction and
refinement. These are the guiding principles of animal research4.
Data repository: a data repository is a structure consisting of one
or more databases containing data for the purpose of analysis.
Data repositories are used in business to provide a centralized
source of information. A data repository may also be referred to
as a data library or a data archive.
Digital object: a digital object is any kind of data that exists in a
digital modality. A digital representation of a physical object or a
process is also considered a digital object.
FAIR: an acronym that stands for Findable, Accessible, Interoperable
and Reusable.
Meta-analysis: a meta-analysis is a statistical technique that
combines findings from multiple independent scientific studies.
In the clinical/preclinical context, meta-analysis is most often used
to assess the effectiveness of interventions by combining data from
several randomized trials.
Raw data: also known as primary or source data, raw data are data
(for example, numbers, instrument readings, figures and so on)
collected from a source and not subjected to (1) processing,
(2) ‘cleaning’ by researchers to remove, for example, outliers and
obvious instrument-reading errors, (3) any analysis (for example,
determining central tendency aspects such as the average or median
result) or (4) any other manipulation by a software program or a
human researcher, analyst or technician.
Note that raw data provide a great deal of flexibility in terms
of data repurposing, given that different questions can be asked
from the original dataset that may (...truncated)