Platform dependence of inference on gene-wise and gene-set involvement in human lung development

BMC Bioinformatics, Jun 2009

Background: With the recent development of microarray technologies, the comparability of gene expression data obtained from different platforms poses an important problem. We evaluated two widely used platforms, Affymetrix U133 Plus 2.0 and the Illumina HumanRef-8 v2 Expression Bead Chips, for comparability in a biological system in which changes may be subtle, namely fetal lung tissue as a function of gestational age.

Results: We performed the comparison via sequence-based probe matching between the two platforms. "Significance grouping" was defined as a measure of comparability. Using both expression correlation and significance grouping as measures of comparability, we demonstrated that despite overall cross-platform differences at the single gene level, increased correlation between the two platforms was found in genes with higher expression level, higher probe overlap, and lower p-value. We also demonstrated that biological function as determined via KEGG pathways or GO categories is more consistent across platforms than single gene analysis.

Conclusion: We conclude that while the comparability of the platforms at the single gene level may be increased by increasing sample size, they are highly comparable ontologically even for subtle differences in a relatively small sample size. Biologically relevant inference should therefore be reproducible across laboratories using different platforms.

http://www.biomedcentral.com/content/pdf/1471-2105-10-189.pdf

Rose Du (0, 1, 2), Kelan Tantisira (0, 2, 5), Vincent Carey (0, 2, 5), Soumyaroop Bhattacharya (2, 3), Stephanie Metje (2), Alvin T Kho (2), Barbara J Klanderman (2), Roger Gaedigk (4), Ross Lazarus (2), Thomas J Mariani (2, 3), J Steven Leeder (4), Scott T Weiss (0, 2, 5)

0 Harvard Medical School, Boston, MA 02115, USA
1 Department of Neurosurgery, Brigham and Women's Hospital, Boston, MA 02115, USA
2 Channing Laboratory, Brigham and Women's Hospital, 181 Longwood Avenue, Boston, MA 02115, USA
3 Department of Pediatrics, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
4 Children's Mercy Hospital, Division of Pediatric Pharmacology and Medical Toxicology, Kansas City, MO 64108, USA
5 Center for Genomic Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA

Background

The rapid development of microarray technologies has resulted in numerous microarray platforms that are analyzed using different protocols across laboratories. Most recently, microarrays by Affymetrix and Illumina have become widely used. While both platforms rely on DNA oligonucleotides as probes, they are fundamentally different in hybridization technology and data preprocessing protocols. Affymetrix arrays use in situ synthesis of 25-mer oligonucleotides while Illumina arrays are based on microbeads which self-assemble onto the array. Each Affymetrix probe is therefore hybridized to a predefined location [1] while the location of each probe on the Illumina array has to be determined using a molecular address [2]. Aside from physical differences, the two platforms also differ in the way in which probes are designed. In general, while Affymetrix uses multiple 25-mer probes for each gene, Illumina uses, on average, 30 copies of the same 50-mer probe (bead-type) for each gene. Finally, while Affymetrix arrays are processed individually, each Illumina chip carries multiple arrays, thus allowing for parallel processing. These differences have resulted in challenges in comparing data sets across platforms and across laboratories using different platforms.

A number of prior studies have attempted to evaluate the comparability of these and other microarray platforms [3-6]. These studies have mainly focused on comparing two very different samples such as different tissues [3,5], tumors [4], and treatment effects on tumors [6].
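Because the two platforms use probes of different lengths (25-mer versus 50-mer), matching them by sequence needs some measure of how much of one probe is covered by the other. The following is a minimal sketch of one such measure, assuming that overlap is scored as the longest exact shared run of bases divided by the length of the shorter probe. The function names and this particular definition are illustrative assumptions and are not taken from the study's own matching procedure.

```python
def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous run of characters shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous base of both strings.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def probe_overlap(short_probe: str, long_probe: str) -> float:
    """Fraction of short_probe covered by its longest exact match in long_probe.

    Hypothetical scoring rule for illustration only.
    """
    if not short_probe:
        return 0.0
    shared = longest_common_substring(short_probe.upper(), long_probe.upper())
    return shared / len(short_probe)
```

A score of 1.0 under this rule means the short probe appears in full inside the long probe; lower scores indicate partial overlap. Real probe matching would also need to handle strand orientation and mismatches, which this sketch ignores.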
In this paper, we perform a cross-platform comparison on a single tissue type over time, namely, fetal lung tissue as a function of gestational age. The sample group used in this study is more closely related to experimental settings in which the differences among groups are not large, hence we do not expect large differences in expression among samples. However, this setting allows us to evaluate the robustness of the effects of different factors on cross-platform comparability in the presence of subtle differences among samples. To do so, we perform both statistical and functional analyses to evaluate statistical comparability as well as biologically relevant effects. We found that the correlation between the Affymetrix and Illumina platforms at the individual gene level is related to expression level, probe overlap, and p-value ranking within each platform, and that the comparability is further improved when considered on a gene-set level using GO categories and KEGG pathways.

Results

Performing probe matching reduces the discrepancy between Affymetrix and Illumina platforms

I (...truncated)
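The gene-level comparison described in the introduction can be illustrated with a small sketch that computes, for each gene measured on both platforms, the correlation of its expression across the same samples. This assumes Pearson correlation and a simple dictionary layout (gene identifier mapped to a list of per-sample values); the source text does not state which correlation coefficient was used, and all names below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant profile has no defined correlation.
        return float("nan")
    return sxy / sqrt(sxx * syy)


def per_gene_correlation(platform_a, platform_b):
    """Correlate each gene present on both platforms across matched samples.

    Both arguments map a gene identifier to a list of expression values,
    with samples in the same order on both platforms (an assumption).
    """
    shared = sorted(set(platform_a) & set(platform_b))
    return {gene: pearson(platform_a[gene], platform_b[gene]) for gene in shared}
```

Genes present on only one platform are dropped, which mirrors the idea of restricting the comparison to matched probes. Stratifying the resulting values by expression level or probe overlap would be a natural next step, but is left out here.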


Rose Du, Kelan Tantisira, Vincent Carey, Soumyaroop Bhattacharya, Stephanie Metje, Alvin T Kho, Barbara J Klanderman, Roger Gaedigk, Ross Lazarus, Thomas J Mariani, J Steven Leeder, Scott T Weiss. Platform dependence of inference on gene-wise and gene-set involvement in human lung development. BMC Bioinformatics, 2009, 10:189. DOI: 10.1186/1471-2105-10-189