Limits of Predictive Models Using Microarray Data for Breast Cancer Clinical Treatment Outcome

JNCI Journal of the National Cancer Institute, Jun 2005

Data from microarray studies have been used to develop predictive models for treatment outcome in breast cancer, such as a recently proposed predictive model for antiestrogen response after tamoxifen treatment that was based on the expression ratio of two genes. We attempted to validate this model on an independent cohort of 58 patients with resectable estrogen receptor–positive breast cancer. We measured expression of the genes HOXB13 and IL17BR with real time–quantitative polymerase chain reaction and assessed the association between their expression and outcome by use of univariate logistic regression, area under the receiver-operating-characteristic curve (AUC), a two-sample t test, and a Mann–Whitney test. We also applied standard supervised methods to the original microarray dataset and to another independent dataset from similar patients to estimate the classification accuracy obtainable by using more than two genes in a microarray-based predictive model. We could not validate the performance of the two-gene predictor on our cohort of samples (relation between outcome and the following genes estimated by logistic regression: for HOXB13, odds ratio [OR] = 1.04, 95% confidence interval [CI] = 0.92 to 1.16, P = .54; for IL17BR, OR = 0.69, 95% CI = 0.40 to 1.20, P = .18; and for HOXB13/IL17BR, OR = 1.30, 95% CI = 0.88 to 1.93, P = .18). Similar results were obtained with the AUC, a two-sample two-sided t test, and a Mann–Whitney test. In addition, estimates of classification accuracies applied to two independent microarray datasets highlighted the poor performance of treatment-response predictive models that can be achieved with the sample sizes of patients and informative genes to date.
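The three odds ratios quoted above can be checked for internal consistency. The sketch below assumes that the confidence intervals are Wald intervals on the log-odds scale, which the text does not state, and backs out a standard error, z statistic, and two-sided P value from each reported OR and 95% CI. The function name `wald_from_ci` is hypothetical. Because the published values are rounded to two decimals, the recomputed P values should only be expected to land within a few hundredths of the reported ones.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back out the log-odds SE, z statistic, and two-sided P value
    from a reported odds ratio and its 95% CI.

    Assumption (not stated in the article): the CI is a Wald interval,
    i.e. exp(beta +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return se, z, p

# Values copied from the abstract: (OR, CI low, CI high, reported P)
reported = {
    "HOXB13": (1.04, 0.92, 1.16, 0.54),
    "IL17BR": (0.69, 0.40, 1.20, 0.18),
    "HOXB13/IL17BR": (1.30, 0.88, 1.93, 0.18),
}

for name, (or_, lo, hi, p_reported) in reported.items():
    se, z, p = wald_from_ci(or_, lo, hi)
    print(f"{name}: SE={se:.3f}, z={z:.2f}, "
          f"recomputed P={p:.2f}, reported P={p_reported:.2f}")
```

Every interval contains 1, so under this assumption none of the three associations is statistically significant at the 5% level, in line with the reported P values.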


James F. Reid, Lara Lusa, Loris De Cecco, Danila Coradini, Silvia Veneroni, Maria Grazia Daidone, Manuela Gariboldi, Marco A. Pierotti

Affiliations of authors: Department of Experimental Oncology, Istituto Nazionale per lo Studio e la Cura dei Tumori, Milano, Italy (JFR, LL, LDC, DC, SV, MGD, MG, MAP); Molecular Cancer Genetics Group, Fondazione Istituto FIRC di Oncologia Molecolare (IFOM), Milano, Italy (JFR, LL, LDC, MG, MAP).

Journal of the National Cancer Institute, Vol. 97, No. 12, June 15, 2005 [J Natl Cancer Inst 2005;97:927–30]

Several studies have demonstrated that breast cancers with distinct pathologic features can be recognized by their gene expression profile (1–11). Microarrays have been used to identify expression patterns capable of predicting outcome or response after specific treatments such as tamoxifen, which is a standard adjuvant treatment for patients with primary, estrogen receptor–positive breast cancer (12,13). Currently, many patients do not respond to treatment, and so additional biomarkers predictive of treatment failure within endocrine-responsive diseases are required.

Recently, a tamoxifen-response predictive model consisting of only two genes has been described (14). By using microarray gene expression profiles of 60 tamoxifen-treated patients, HOXB13 and IL17BR were identified as the two genes whose expression ratio predicts clinical outcome. This finding was validated by use of real time–quantitative polymerase chain reaction (RT-QPCR) on an independent set of 20 formalin-fixed, paraffin-embedded samples by correctly classifying the outcomes of 16 patients (P = .01). However, when the data from relapsed and disease-free patients are considered separately, the probability of obtaining such a correct classification by chance remained low for disease-free patients (nine of 10 correctly classified, P = .02; 95% confidence interval [CI] for the proportion of correctly classified samples = 0.55 to 0.99) but increased drastically for relapsed patients (seven of 10 correctly classified, P = .34; 95% CI = 0.35 to 0.93).
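The chance-classification figures quoted above (16 of 20, nine of 10, and seven of 10 correct) can be approximately reproduced with a short calculation. The sketch below assumes an exact two-sided binomial test against a chance rate of 0.5 and Clopper–Pearson exact intervals. The text does not say which test or interval method was used, so this is a plausibility check rather than a reconstruction of the original analysis. The function names are hypothetical, and only the Python standard library is used.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_p(k, n, p0=0.5):
    """Exact two-sided binomial P value: total probability of all
    outcomes that are no more likely than the observed one under p0."""
    observed = binom_pmf(k, n, p0)
    return sum(binom_pmf(i, n, p0) for i in range(n + 1)
               if binom_pmf(i, n, p0) <= observed * (1 + 1e-9))

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) CI for a proportion, found by bisection
    on the binomial tail probabilities."""
    def upper_tail(p):  # P(X >= k), increasing in p
        return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

    def lower_tail(p):  # P(X <= k), decreasing in p
        return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

    def solve(f, target, increasing):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if (f(mid) < target) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(upper_tail, alpha / 2, True)
    upper = 1.0 if k == n else solve(lower_tail, alpha / 2, False)
    return lower, upper

# Counts taken from the text: all 20 samples, disease-free, relapsed
for label, k, n in [("all", 16, 20), ("disease-free", 9, 10), ("relapsed", 7, 10)]:
    lo, hi = clopper_pearson(k, n)
    print(f"{label}: {k}/{n} correct, P={two_sided_p(k, n):.2f}, "
          f"95% CI={lo:.2f} to {hi:.2f}")
```

Under these assumptions the P values round to the figures given in the text, and the intervals are close to the reported ones. With only 10 relapsed patients, seven correct classifications cannot be distinguished from coin-flipping, which is the point the authors make about the size of the validation set.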
Although the proposed predictive model is very appealing from clinical and practical points of view because of its potential straightforward application in many laboratories, the results of the validation set (i.e., the statistically nonsignificant results for the relapsed patients) indicate that a larger validation set is required. For this reason, we applied this two-gene predictive model for relapse to a dataset derived from a cohort of 58 patients with early-stage, estrogen receptor–positive primary breast cancer who were treated at the Istituto Nazionale Tumori between March 1, 1991, and December 31, 1997, with radical or conservative surgery plus radiotherapy followed by adjuvant monotherapy with tamoxifen (median treatment duration = 60 months, range = 27–84 months). All patients signed an informed consent to donate any tissue leftover after diagnostic procedures to Istituto Nazionale Tumori. A tumor was classified as estrogen receptor positive if the ligand binding assay detected more than 10 fmol of estrogen bound per mg of total protein. Disease recurred with distant metastasis in 18 patients (16 patients as a first event and two as a second event after local-regional recurrence) of the 58 patients within a median time of 31 months (range = 14–43 months) from surgery. Forty of the 58 patients were disease free after a median time of 93 months (range = 70–125 months). Clinical and pathobiologic details of these 58 patients are presented in supplemental Table 1 (Available at: http://jncicancerspectrum.oupjournals.org/jnci/content/vol97/issue12). Most patients were older than 50 years of age (93.1%) and had lymph node–positive disease (77.5%; 53.5% had one to three positive lymph nodes and 24.0% had more than three positive lymph nodes).
Their tumors were larger than 2 cm (62.1% of tumors), were progesterone receptor positive (79.3% of tumors; i.e., more than 25 fmol of progesterone bound per mg of total protein by ligand binding assay), and were HER-2/neu negative (77.6% of tumors). HER-2/neu status was immunohistochemically assessed with polyclonal antibody against p185HER2 protein (1:2000 dilution, DAKO, Milan, Italy) and defined as positive when strong membrane labeling was observed. A limitation of any validation study on an independent cohort is that its mixture of case patients may differ from that of the original study. Compared with the previously described cohort (14), our cohort had a higher prevalence of tumors that were lymph node positive (77.5% vs. 47.2%), HER-2/neu positive (20.7% vs. 5.4%), and larger than 2 cm (62.1% vs. 47.2%). RT-QPCR used TaqMan gene expression assays for the following genes: HOXB13 labeled with FAM-MGB (a 6-carboxyfluorescein fluorescent dye and a minor groove binding [MGB] molecule attached to the 3′ end, which stabilizes the probe annealing; product Hs00197189), IL17BR labeled with FAM-MGB (product Hs00218889), and human GAPDH VIC-MGB (VIC is a proprietary fluorescent dye; product 4326317E), a housekeeping gene used for normalization (Applied Biosystems, Foste (...truncated)



James F. Reid, Lara Lusa, Loris De Cecco, Danila Coradini, Silvia Veneroni, Maria Grazia Daidone, Manuela Gariboldi, Marco A. Pierotti. Limits of Predictive Models Using Microarray Data for Breast Cancer Clinical Treatment Outcome, JNCI Journal of the National Cancer Institute, 2005, pp. 927-930, 97/12, DOI: 10.1093/jnci/dji153