The Stanford Tissue Microarray Database
Robert J. Marinelli [1,2], Kelli Montgomery [0], Chih Long Liu [4], Nigam H. Shah [4], Wijan Prapong [4], Michael Nitzberg [2], Zachariah K. Zachariah [2], Gavin J. Sherlock [3], Yasodha Natkunam [0], Robert B. West [0], Matt van de Rijn [0], Patrick O. Brown [1,2], Catherine A. Ball [2]
[0] Department of Pathology, Stanford University School of Medicine
[1] Howard Hughes Medical Institute
[2] Department of Biochemistry, Stanford University School of Medicine
[3] Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
[4] Department of Medicine, Stanford University
The Stanford Tissue Microarray Database (TMAD; http://tma.stanford.edu) is a public resource for disseminating annotated tissue images and associated expression data. Stanford University pathologists, researchers and their collaborators worldwide use TMAD for designing, viewing, scoring and analyzing their tissue microarrays. The use of tissue microarrays allows hundreds of human tissue cores to be simultaneously probed by antibodies to detect protein abundance (immunohistochemistry; IHC), or by labeled nucleic acids (in situ hybridization; ISH) to detect transcript abundance. TMAD archives multi-wavelength fluorescence and brightfield images of tissue microarrays for scoring and analysis. As of July 2007, TMAD contained 205 161 images archiving 349 distinct probes on 1488 tissue microarray slides. Of these, 31 306 images for 68 probes on 125 slides have been released to the public. To date, 12 publications have been based on these raw public data. TMAD incorporates the NCI Thesaurus ontology for searching tissues in the cancer domain. Image processing researchers can extract images and scores for training and testing classification algorithms. The production server uses the Apache HTTP Server, Oracle Database and Perl application code. Source code is available to interested researchers under a no-cost license.
The Tissue Microarray Database (TMAD; http://tma.stanford.edu)
at Stanford University is a web-based
system that provides researchers with tissue microarray
design tools, image scoring and annotation tools, data
sharing mechanisms, an image archive, an analysis toolset
and a publication mechanism. Tissue microarray
experiments provide in situ detection of protein, DNA and RNA
targets on hundreds of tissue specimens per slide through
chromogenic and fluorescence stains. Images at
subcellular resolution of each specimen are taken for subsequent
scoring and analysis. Each image is rich in multivariate
information including cell composition and morphology
as well as stain localization.
In 1987, Wan et al. (1) described a method to
immunohistochemically stain many different tissues
simultaneously on a single slide, the stated advantages being
great economies in time, reagents, tissue specimens and
antibodies. Tissue microarrays in their current form were
developed by Kallioniemi and Sauter (2) for
high-throughput molecular profiling of tissue specimens.
Twenty years later these advantages have proven to be
true, and today the Stanford Tissue Microarray Database
contains over 200 000 stained and scored tissue microarray
images along with associated tissue metadata describing
the tissues, associated clinical diagnosis and follow-up
where available. TMAD provides tools for tissue
microarray design, image and score import, and analysis
through an intuitive web interface.
Several database object models (3,4) and systems (5-10)
have been described for managing tissue microarray data.
Goals range from metadata modeling to comprehensive
management of tissue microarrays for large research
groups. While there are similarities, TMAD differs by
providing public access to raw tissue microarray
experiment data. As part of ongoing collaborations with
non-US research groups, we have constructed a straightforward
method to import images and metadata from
collaborating institutions, eliminating the transport of
samples and slides between institutes and the resulting
complications and delays.
The Human Protein Atlas project (11,12) has published
a comprehensive, publicly accessible, antibody-based protein
atlas based on the systematic creation of protein-specific
antibodies, which are applied to tissue microarrays to create
expression and localization profiles in 48 normal human
tissues, 20 varied cancers and 47 cell lines. Their
version 2.0 Atlas, available at http://www.proteinatlas.org/,
includes over 1 200 000 images corresponding to over 1500
antibodies. We believe that TMAD provides a
complementary service with selected probe data across a wider
variety of disease tissues along with an integrated tissue
microarray toolset.
The Nordic Immunohistochemical Quality Control
organization (13) publishes very detailed IHC results
including thousands of images for clinically important
epitopes. Their data comes from over 100 laboratories
that participate in quality control studies by performing
independent stains on serial sections of multiple tissue
blocks that are then verified independently. Their in-depth
information on antigens and protocols is available at
http://www.nordiqc.org/. While TMAD includes standard
clinical antibody probes, it adds many novel emerging
antibody probes useful for the molecular sub-classification
of cancers.
We designed TMAD to allow for the release of raw
supporting data (including images) at the time of
publication for all experiments held in TMAD.
Researchers using TMAD observe a policy of making
data publicly available through TMAD at the point of
publication (or earlier) (14-20). We have implemented
automated mechanisms that allow tagging the complete
set of experiments associated with each new publication,
resulting in nearly one-click publication of the raw data
(stained images and scores assigned by pathologists)
through TMAD.
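The tag-then-release workflow described above can be pictured with a minimal Python sketch. This is an illustration only, not TMAD's implementation (the production system is Perl over an Oracle database); the `Experiment` class, its fields, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Experiment:
    """Hypothetical stand-in for one stained slide and its scores."""
    experiment_id: int
    group: str
    publication_id: Optional[str] = None  # tag set when a paper is prepared
    is_public: bool = False               # flipped at (or before) publication


def tag_for_publication(experiments: List[Experiment],
                        experiment_ids: Iterable[int],
                        publication_id: str) -> List[Experiment]:
    """Attach a publication tag to every listed experiment."""
    wanted = set(experiment_ids)
    tagged = [e for e in experiments if e.experiment_id in wanted]
    for e in tagged:
        e.publication_id = publication_id
    return tagged


def release_publication(experiments: List[Experiment],
                        publication_id: str) -> int:
    """Make all experiments carrying the tag public; return how many changed."""
    released = 0
    for e in experiments:
        if e.publication_id == publication_id and not e.is_public:
            e.is_public = True
            released += 1
    return released
```

Under these assumptions, tagging is a separate step from release, so the full set of experiments behind a paper can be assembled in advance and made public in a single action.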
As of July 2007, TMAD contained 205 161 images
archiving 349 distinct probes on 1488 stained tissue
microarray slides. Of these, 31 306 images for 68 probes
on 125 slides have been released to the public.
By focusing on the release of data for public use, we
anticipate improved collaboration among data model and
database developers. Our real-world data can be used to
validate both object models and eXtensible Markup
Language (XML) (21) based tissue microarray data
exchange specifications (22,23). Images from multiple
automated microscopes using varied imaging modalities
and stains should provide rich training and test datasets.
As our user community is located around the world, all
user interaction is via the Internet through
standards-compliant web browser pages. All functions are available
to authenticated Stanford researchers and their
collaborators; authorization to access a given experiment
is governed by experiment-to-group mappings maintained
in the database. Data access is restricted by group until
publication, at which time the data are made visible to the public.
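As a rough illustration of the access rule just described, the following Python sketch checks visibility from an experiment-to-group mapping plus a set of published experiments. The function name, argument names and data shapes are assumptions made for this example; they do not reflect TMAD's actual schema or code.

```python
from typing import Dict, Iterable, Optional, Set


def can_view(experiment_id: int,
             user_groups: Optional[Iterable[str]],
             experiment_groups: Dict[int, Set[str]],
             public_experiments: Set[int]) -> bool:
    """Return True if a user may see the experiment.

    user_groups is None for an anonymous (unauthenticated) visitor.
    experiment_groups is a hypothetical experiment-to-group mapping.
    """
    # Published data are visible to everyone, including anonymous visitors.
    if experiment_id in public_experiments:
        return True
    # Unpublished data require membership in at least one mapped group.
    if user_groups is None:
        return False
    allowed = experiment_groups.get(experiment_id, set())
    return bool(allowed & set(user_groups))
```

In this sketch, publication simply adds the experiment to `public_experiments`; the group mapping is left untouched, so collaborators keep the same access they had before release.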
Public data may be searched, the analysis pipeline may be
run and both input and output datasets may be freely
downloaded (...truncated)