An integrated bioinformatics approach to improve two-color microarray quality-control: impact on biological conclusions

Genes & Nutrition, Mar 2009

Omics technology used for large-scale measurements of gene expression is rapidly evolving. This work points out the need for extensive bioinformatics analysis to assess array quality before and after gene expression clustering and pathway analysis. A study of the effect of red wine polyphenols on rat colon mucosa was used to test the impact of quality control and normalisation steps on the biological conclusions. The integration of data visualization, pathway analysis and clustering revealed an artifact that was resolved with an adapted normalisation. We propose a possible point-by-point standard analysis procedure, based on a combination of clustering and data visualization, for the analysis of microarray data.


Rachel I. M. van Haaften, Cristina Luceri, Arie van Erk, Chris T. A. Evelo

C. Luceri: Department of Pharmacology, University of Florence, Florence, Italy
R. I. M. van Haaften, A. van Erk, C. T. A. Evelo (corresponding author): Department of Bioinformatics-BiGCaT, Maastricht University, UNS50, Box 19, P.O. Box 616, 6200 MD Maastricht, The Netherlands

Microarray technology, used for large-scale measurement of gene expression at the mRNA level in biomedical research, is evolving rapidly. Studies using this technology yield huge amounts of data, which must be analyzed correctly to give useful information about the physiological outcome of the experiment. Between array production and the final physiological outcome of a microarray experiment, numerous factors can strongly affect the interpretation of the results. The construction of a microarray requires the production of a large number of correct probes and accurate spotting of those probes onto glass slides. Many factors can influence the spotting, e.g., blocked spotting pins, glass slide surface treatment and environmental conditions [1–3].
These and other technical issues during microarray preparation can affect spot quality, which can be assessed after image analysis of the scanned microarray images. Spot quality can be documented by, e.g., signal-to-noise ratio, spot size irregularity, intensity saturation status, intensity distribution issues resulting from non-specific binding or irregular distribution of the printed DNA on the slide, morphological issues and background issues [4, 5]. Besides the production of the microarray, the final results of an experiment can also be influenced by the quality of the initial RNA sample before hybridization and by the researcher performing the actual hybridization of the sample onto the array [6–8]. Some sources of variation can be removed or minimized by excluding bad spots from further analysis or, in the worst case, excluding a complete array [9–12]. Once the quality of the array has been judged, functional data analysis can be performed, which should finally lead to a biological conclusion.

The current paper describes a workflow for quality control and analysis of two-color microarray data. To test the proposed workflow, we analyzed data from an experiment set up to explore possible mechanisms for the protective effects of dietary polyphenols on colon mucosa. A number of studies have demonstrated that treatment with polyphenols has chemopreventive effects against colon carcinogenesis [13–15], probably linked to their antioxidant [16], pro-apoptotic [13] and anti-inflammatory activities. The paper demonstrates that insufficient quality control and incorrect normalisation can lead to wrong biological conclusions.

Materials and methods

Microarray construction

The microarrays were constructed using the Rat Genome Oligo Set version 1.1 (Operon Technologies, CA, USA), composed of 70-mer probes representing 5,677 well-characterized Rattus norvegicus genes divided over seventeen 384-well plates.
The oligonucleotides were spotted with an OmniGrid 100 microarrayer (Genomic Solutions, Ann Arbor, MI, USA) onto poly-L-lysine glass slides (Erie Scientific Company, Portsmouth, NH, USA), all on the same day, using a print head with 16 pins. The Operon plates were loaded into the machine in order, from plate 1 to plate 17, so the oligos from every plate end up distributed over all blocks.

Animals and samples

In the experiment, two groups of rats were compared. The control group consisted of 10 male, 5–6-week-old Fischer 344 (F344) rats (Nossan, Correzzana, Milan, Italy) fed a high fat diet (control diet) for 2 weeks. The high fat diet was based on the AIN76 diet [17], modified to contain a high level of fat (23% corn oil w/w) and a low level of cellulose (2% w/w) to mimic the high risk of colon cancer in human populations consuming high fat diets. The experimental group consisted of 10 males, 5–6-week-old, (...truncated)

Fig. 1 (a) Hierarchical clustering after the first data analysis; genes are shown in a dendrogram based on the similarity between ten rats. (b) Hierarchical clustering after a local normalisation
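
The statement under Microarray construction that oligos from every plate end up distributed over all blocks can be checked with a small simulation. The sketch below is illustrative only and is not taken from the paper. It assumes that the 16 pins form a 4 x 4 grid, that neighbouring pins sit two wells apart on a standard 16 x 24 well plate, and that each pin prints exactly one block. The names `pin_for_well` and `block_contents` are hypothetical.

```python
from collections import Counter, defaultdict

N_PLATES = 17                   # from the text: seventeen 384-well plates
PLATE_ROWS, PLATE_COLS = 16, 24 # standard 384-well plate layout
HEAD_ROWS, HEAD_COLS = 4, 4     # assumption: the 16 pins form a 4 x 4 grid
PIN_STRIDE = 2                  # assumption: adjacent pins are two wells apart


def pin_for_well(row, col):
    """Return the pin index (0..15) that visits a well under the assumed geometry."""
    span_rows = HEAD_ROWS * PIN_STRIDE  # wells covered by the head, vertically
    span_cols = HEAD_COLS * PIN_STRIDE  # wells covered by the head, horizontally
    pin_row = (row % span_rows) // PIN_STRIDE
    pin_col = (col % span_cols) // PIN_STRIDE
    return pin_row * HEAD_COLS + pin_col


def block_contents():
    """Map each block (one per pin, by assumption) to its source plates and well count."""
    plates_per_block = defaultdict(set)
    wells_per_block = Counter()
    for plate in range(1, N_PLATES + 1):
        for row in range(PLATE_ROWS):
            for col in range(PLATE_COLS):
                block = pin_for_well(row, col)
                plates_per_block[block].add(plate)
                wells_per_block[block] += 1
    return plates_per_block, wells_per_block


plates_per_block, wells_per_block = block_contents()
for block in sorted(plates_per_block):
    print(block, len(plates_per_block[block]), wells_per_block[block])
```

Under these assumptions each pin draws 24 wells from every plate, so each of the 16 blocks receives material from all 17 plates. One consequence is that a problem tied to a single source plate would appear in every block of the slide instead of being confined to one region.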


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007%2Fs12263-009-0115-8.pdf

Rachel I. M. van Haaften, Cristina Luceri, Arie van Erk, Chris T. A. Evelo. An integrated bioinformatics approach to improve two-color microarray quality-control: impact on biological conclusions, Genes & Nutrition, 2009, pp. 123-127, Volume 4, Issue 2, DOI: 10.1007/s12263-009-0115-8