An integrated bioinformatics approach to improve two-color microarray quality-control: impact on biological conclusions
Rachel I. M. van Haaften, Cristina Luceri, Arie van Erk, Chris T. A. Evelo
C. Luceri: Department of Pharmacology, University of Florence, Florence, Italy
R. I. M. van Haaften, A. van Erk, C. T. A. Evelo (corresponding author): Department of Bioinformatics-BiGCaT, Maastricht University, UNS50, Box 19, P.O. Box 616, 6200 MD Maastricht, The Netherlands
Omics technology used for large-scale measurements of gene expression is rapidly evolving. This work points out the need for extensive bioinformatics analysis to assess array quality before and after gene expression clustering and pathway analysis. A study of the effect of red wine polyphenols on rat colon mucosa was used to test the impact of quality control and normalisation steps on the biological conclusions. The integration of data visualization, pathway analysis and clustering revealed an artifact problem that was solved with an adapted normalisation. We propose a possible point-by-point standard analysis procedure, based on a combination of clustering and data visualization, for the analysis of microarray data.
Microarray technology, used for large-scale measurements of
gene expression at the mRNA level in biomedical research, is
evolving rapidly. Studies using this technology yield huge
amounts of data, which have to be analyzed correctly to give
useful information about the physiological outcome of the
experiment. Between array production and the final
physiological interpretation of a microarray experiment,
numerous factors can have a large impact on how the results
are interpreted.
The construction of a microarray requires the production
of a large number of correct probes and accurate spotting of
the probes onto the glass slides. Many factors can influence
the spotting, e.g., blocked spotting pins, glass slide surface
treatment and environmental conditions [1–3].
These and other technical issues during microarray
preparation can affect spot quality, which can be assessed
after image analysis of the scanned microarray images. Spot
quality can be documented by, e.g., signal-to-noise ratio,
spot size irregularity, intensity saturation status,
intensity distribution issues resulting from non-specific
binding or irregular distribution of the printed DNA on the
slide, morphological issues and background issues [4, 5].
Besides the production of the microarray, the final results
of an experiment can also be influenced by the quality of
the initial RNA sample before hybridization and by the
researcher performing the actual hybridization of the sample
onto the array [6–8]. Some of these sources of variation can
be removed or minimized by excluding bad spots from further
analysis or, in the worst case, by excluding a complete
array [9–12]. Once the quality of the array has been judged,
functional data analysis can be performed, which should
finally lead to a biological conclusion.
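The spot-level and array-level filtering described above can be sketched as follows. This is only an illustration, not the procedure used in this study: the `Spot` fields, the signal-to-noise definition (background-corrected signal divided by the background standard deviation) and all three thresholds are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Spot:
    foreground: float          # mean foreground intensity of the spot
    background: float          # mean local background intensity
    background_sd: float       # standard deviation of the local background
    saturated_fraction: float  # fraction of saturated pixels in the spot


# Hypothetical thresholds, chosen only for illustration.
MIN_SNR = 3.0
MAX_SATURATED_FRACTION = 0.1
MAX_BAD_SPOT_FRACTION = 0.3


def signal_to_noise(spot: Spot) -> float:
    """Background-corrected signal divided by the background noise."""
    if spot.background_sd <= 0:
        return float("inf") if spot.foreground > spot.background else 0.0
    return (spot.foreground - spot.background) / spot.background_sd


def is_bad(spot: Spot) -> bool:
    """Flag spots that are too close to background or saturated."""
    return (signal_to_noise(spot) < MIN_SNR
            or spot.saturated_fraction > MAX_SATURATED_FRACTION)


def assess_array(spots: List[Spot]) -> Tuple[List[Spot], float, bool]:
    """Return the retained spots, the fraction of bad spots, and
    whether the whole array should be excluded."""
    good = [s for s in spots if not is_bad(s)]
    bad_fraction = 1.0 - len(good) / len(spots) if spots else 1.0
    return good, bad_fraction, bad_fraction > MAX_BAD_SPOT_FRACTION
```

Calling `assess_array` on the spots of one channel returns the spots to keep and a flag for whole-array removal; in practice the other criteria listed above (spot size, morphology, intensity distribution) would be added as further checks in `is_bad`.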
The current paper describes a workflow for quality control
and analysis of two-color microarray data. To test the
proposed workflow we analyzed data obtained from an
experiment set up to explore the possible mechanisms for the
protective effects of dietary polyphenols on colon mucosa.
A number of studies have demonstrated that treatments with
polyphenols have chemopreventive effects against colon
carcinogenesis [13–15], probably linked to their antioxidant
[16], pro-apoptotic [13] and anti-inflammatory activities.
The paper demonstrates that insufficient quality control and
incorrect normalisation can lead to wrong biological
conclusions.
Materials and methods
Microarray construction
The microarrays were constructed using the Rat Genome Oligo
Set version 1.1 (Operon Technologies, CA, USA), composed of
70mer probes representing 5,677 well-characterized Rattus
norvegicus genes divided over seventeen 384-well plates. The
oligonucleotides were spotted with an OmniGrid 100
microarrayer (Genomic Solutions, Ann Arbor, MI, USA) onto
poly-L-lysine glass slides (Erie Scientific Company,
Portsmouth, NH, USA), on the same day, using a print head
with 16 pins. The Operon plates were loaded into the machine
in order, from plate 1 to plate 17, so that the oligos from
every plate end up distributed over all blocks.
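The arithmetic behind this layout can be made explicit with a short sketch. The numbers of plates, wells, pins and genes are taken from the text; everything else is an assumption made for illustration: that each pin prints exactly one block, that every pin takes an equal share of the wells of each plate, and that each well is spotted once per slide.

```python
N_PLATES = 17          # Operon plates (from the text)
WELLS_PER_PLATE = 384  # wells per plate (from the text)
N_PINS = 16            # pins in the print head (from the text)
N_GENES = 5677         # genes represented in the oligo set (from the text)


def spots_per_pin_per_plate(wells: int = WELLS_PER_PLATE,
                            pins: int = N_PINS) -> int:
    """Wells of one plate visited by a single pin, assuming an equal split."""
    if wells % pins != 0:
        raise ValueError("wells cannot be divided evenly over the pins")
    return wells // pins


def layout_summary() -> dict:
    """Summarize the array layout under the assumptions stated above."""
    per_pin = spots_per_pin_per_plate()
    total_wells = N_PLATES * WELLS_PER_PLATE
    return {
        "total_wells": total_wells,
        # Assumption: one block per pin, so this is also spots per block.
        "spots_per_block_from_each_plate": per_pin,
        "spots_per_block": per_pin * N_PLATES,
        # Wells in excess of the gene count; their use is not stated
        # in the text, so no interpretation is attached here.
        "wells_beyond_gene_count": total_wells - N_GENES,
    }
```

Under these assumptions each block receives the same number of spots from every plate, which is why a plate-specific problem would show up in all blocks rather than in one region of the slide.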
Fig. 1 (a) Hierarchical clustering after the first data analysis; genes are shown in a dendrogram based on the similarity between ten rats. (b) Hierarchical clustering after a local normalisation
Animals and samples
In the experiment, two groups of rats were compared: the
control group consisted of 10 male, 5–6-week-old Fischer
344 (F344) rats (Nossan, Correzzana, Milan, Italy) fed a
high fat diet (control diet) for 2 weeks. The high fat diet
was based on the AIN76 diet [17] modified to contain a
high level of fat (23% corn oil w/w) and a low level of
cellulose (2% w/w) to mimic the high risk of colon cancer
in human populations consuming high fat diets. The
experimental group consisted of 10 males, 5–6-week-old, (...truncated)