Intensity‐based analysis of two‐colour microarrays enables efficient and flexible hybridization designs

Published online February 24, 2004
Nucleic Acids Research, 2004, Vol. 32, No. 4 e41
DOI: 10.1093/nar/gnh038

Peter A. C. 't Hoen1,*, Rolf Turk1, Judith M. Boer1, Ellen Sterrenburg1, Renée X. de Menezes3, Gert-Jan B. van Ommen1 and Johan T. den Dunnen1,2

1Center for Human and Clinical Genetics, 2Leiden Genome Technology Center and 3Department of Medical Statistics, Leiden University Medical Center, Wassenaarseweg 72, 2333 AL Leiden, The Netherlands

Received December 17, 2003; Revised January 30, 2004; Accepted January 31, 2004

*To whom correspondence should be addressed. Tel: +31 71 527 6611; Fax: +31 71 527 6075

Nucleic Acids Research, Vol. 32 No. 4 © Oxford University Press 2004; all rights reserved

ABSTRACT

In two-colour microarrays, the ratio of signal intensities of two co-hybridized samples is used as a relative measure of gene expression. Ratio-based analysis becomes complicated and inefficient in multi-class comparisons. We therefore investigated the validity of an intensity-based analysis procedure. To this end, two different cRNA targets were hybridized together, separately, with a common reference and in a self–self fashion on spotted 65mer oligonucleotide microarrays. We found that the signal intensity of the cRNA targets was not influenced by the presence of a target labelled in the opposite colour. This indicates that targets do not compete for binding sites on the array, which is essential for intensity-based analysis. It is demonstrated that, for good-quality arrays, the correlation of signal intensity measurements between the different hybridization designs is high (R > 0.9). Furthermore, ratio calculations from ratio- and intensity-based analyses correlated well (R > 0.8). Based on these results, we advocate the use of separate intensities rather than ratios in the analysis of two-colour long-oligonucleotide microarrays. Intensity-based analysis makes microarray experiments more efficient and more flexible: it allows for direct comparisons between all hybridized samples, while circumventing the need for a reference sample that occupies half of the hybridization capacity.

INTRODUCTION

DNA microarrays are widely used to measure genome-wide changes in mRNA expression levels across conditions such as developmental stages, disease states, drug treatment and gene disruption (1–5). Affymetrix GeneChips, prepared by photolithography, and spotted cDNA and 50–70mer oligonucleotide microarrays are currently the most frequently used platforms. The GeneChip is a one-colour system based on the immunofluorescent detection of biotinylated nucleic acids. The difference in perfect and mismatch probe intensities is used for gene expression measurements (6). Spotted microarrays are commonly hybridized with two samples labelled with two different fluorophores. For these arrays, the ratio of the signal intensities in the two channels is a relative measure of gene expression.

Normalization is essential to remove systematic biases in microarray data. For two-colour arrays, normalization algorithms can be applied to (log-transformed) ratios (7) (e.g. using a LOWESS algorithm). Alternatively, ANOVA models that account for array, dye and spot effects can be applied to the individual signal intensities on all the arrays (8). In both cases, after normalization, the ratio of the co-hybridized samples is usually calculated to minimize the influence of spatial variation in spot morphology and hybridization efficiency on the experimental outcome. Furthermore, some suggest that ratio-based analysis is important because of possible competitive hybridization of the two targets due to saturation of binding sites on the array (9). Ratio-based analysis can be applied to experiments with a reference or loop design (10,11).
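
The LOWESS normalization of log-transformed ratios mentioned above can be illustrated with a minimal sketch. It is not the procedure used in this study: it assumes that the log ratio M = LN(Cy5/Cy3) is smoothed against the mean log intensity A = 0.5*[LN(Cy5) + LN(Cy3)], and it uses a simplified smoother (tricube-weighted local linear fit, no robustness iterations). The function names `lowess_fit` and `normalize_log_ratios`, the `frac` parameter and the intensities are hypothetical and chosen only for illustration.

```python
import math


def lowess_fit(x, y, frac=0.4):
    """Simplified LOWESS: tricube-weighted local linear fit at each x[i].

    Assumption: no robustness iterations, unlike full LOWESS.
    """
    n = len(x)
    k = min(n, max(3, int(math.ceil(frac * n))))  # points per local window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xj - x0) for xj in x)
        d_max = dists[k - 1] or 1.0  # bandwidth: distance to k-th nearest point
        sw = swx = swy = swxx = swxy = 0.0
        for xj, yj in zip(x, y):
            u = abs(xj - x0) / d_max
            if u >= 1.0:
                continue  # outside the local window
            w = (1.0 - u ** 3) ** 3  # tricube weight
            dx = xj - x0  # centre on x0 so the intercept is the fit at x0
            sw += w
            swx += w * dx
            swy += w * yj
            swxx += w * dx * dx
            swxy += w * dx * yj
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            fitted.append(swy / sw)  # degenerate window: weighted mean
        else:
            # intercept of the weighted least-squares line
            fitted.append((swxx * swy - swx * swxy) / denom)
    return fitted


def normalize_log_ratios(cy3, cy5, frac=0.4):
    """Subtract the intensity-dependent trend from the log ratios."""
    m = [math.log(r / g) for g, r in zip(cy3, cy5)]  # log ratio M
    a = [0.5 * (math.log(r) + math.log(g)) for g, r in zip(cy3, cy5)]  # mean log intensity A
    trend = lowess_fit(a, m, frac)
    return [mi - ti for mi, ti in zip(m, trend)]


# Synthetic intensities with an intensity-dependent dye bias (not study data).
cy3 = [math.exp(5.0 + 0.25 * i) for i in range(20)]
cy5 = [g * math.exp(0.3 + 0.1 * math.log(g)) for g in cy3]
normalized = normalize_log_ratios(cy3, cy5)
print(max(abs(v) for v in normalized))
```

In this synthetic example the dye bias is an exactly linear function of A, so the local linear fit removes it and only rounding error remains. Real data would normally call for a full LOWESS implementation with robustness iterations.
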
A disadvantage of the reference design is that half of the acquired data represent only one sample that is often not biologically relevant, thereby doubling the number of arrays required (10,11). A loop design has other disadvantages (11). The calculated ratios have variable levels of precision since some samples are more directly related than others, and the set of hybridizations cannot be extended. This has important implications for studies in which not all samples become available at the same time; new samples can only be included in the experiment by forming new subloops, and only if biological material from the earlier samples is still available. An intensity-based analysis in which the signal intensities in the two channels are kept separate, also after normalization, would allow for hybridization designs that are more efficient than the reference design and more flexible than the loop design.

We designed a set of experiments to determine whether an intensity-based analysis would be justified for our spotted long-oligonucleotide microarrays. Our aims are twofold: first, to investigate whether hybridization patterns are sufficiently uniform across arrays; secondly, to verify whether there is evidence for competition between targets for binding sites on the array. We run two parallel statistical analyses, one ratio-based and the other intensity-based, and compare their results.

MATERIALS AND METHODS

Microarray and target preparation

Feature extraction and data analysis

Feature extraction was performed with GenePix 3.0 software (Axon Instruments Inc.). Spots with intensities lower than background or with aberrant spot shape were flagged by the software and checked manually. Only spots that were not flagged on any of the analysed arrays were taken into account in further analyses, leaving 2224 data points per array. Local background-subtracted median signal intensities were used as intensity measures. Scaled gene expression ratios in samples A and B were calculated after transformation (natural logarithm) of the background-corrected intensities and subtraction of the average of the LN-transformed intensities (linear scaling).

Table 1. Overview of the hybridization designs used and the ratio calculations

Hyb design   Array   Cy3   Cy5   Ratio
CoHyb          1     A     B     R1
               2     B     A     R2
ComRef         3     A     REF   R3
               4     REF   A     R4
               5     B     REF   R5
               6     REF   B     R6
OneColour      7     A     –     R7, R8
               8     –     B
               9     B     –
              10     –     A
SelfSelf      11     A     A
              12     B     B

R1–R8 are calculated from scaled LN-transformed background-subtracted intensities. Average ratios are then calculated according to:
LN(RatioCoHyb): 0.5*[LN(R1) - LN(R2)]
LN(RatioComRef): 0.5*[LN(R5) - LN(R6)] - 0.5*[LN(R3) - LN(R4)]
LN(RatioOneColour): 0.5*[LN(R8) - LN(R7)]

each individual target (A and B) separately. F-statistics and corresponding p values are based upon the F2 statistic available in the MAANOVA package, which is a shrunk version of the classic F-statistic. To avoid distributional assumptions, the package offers the possibility of computing p values for hypothesis tests via permutation methods. We have chosen to perfo (...truncated)