Relation between smoking history and gene expression profiles in lung adenocarcinomas
Johan Staaf
Göran Jönsson
Mats Jönsson
Anna Karlsson
Sofi Isaksson
Annette Salomonsson
Helen M. Pettersson
Maria Soller
Sven-Börje Ewers
Leif Johansson
Per Jönsson
Maria Planck
Department of Oncology, Clinical Sciences, Lund University and Skåne University Hospital, Barngatan 2:1, SE-22185 Lund, Sweden
Background: Lung cancer is the leading cause of cancer death worldwide. Tobacco use is the major pathogenic factor, but not all lung cancers are attributable to smoking. Specifically, lung cancer in never-smokers has been suggested to represent a distinct disease entity compared to lung cancer arising in smokers, due to differences in etiology, natural history and response to specific treatment regimens. However, the genetic aberrations that differ between lung carcinomas of smokers and never-smokers remain to a large extent unclear.
Methods: Unsupervised gene expression analysis of 39 primary lung adenocarcinomas was performed using Illumina HT-12 microarrays. Results from unsupervised analysis were validated in six external adenocarcinoma data sets (n=687) and in six data sets comprising normal airway epithelial or normal lung tissue specimens (n=467). Supervised gene expression analysis between smokers and never-smokers was performed in seven adenocarcinoma data sets, and the results were validated in the six normal data sets.
Results: Initial unsupervised analysis of 39 adenocarcinomas identified two subgroups, one of which harbored all never-smokers. A gene expression signature generated from this analysis subsequently identified never-smokers with 79-100% sensitivity in external adenocarcinoma data sets and with 76-88% sensitivity in the normal data sets. A notable fraction of current/former smokers were grouped with never-smokers. Intriguingly, supervised analysis of never-smokers versus smokers in seven adenocarcinoma data sets generated similar results. Overlap in classification between the two approaches was high, indicating that both approaches identify a common set of samples from current/former smokers as potential never-smokers. The gene signature from unsupervised analysis included several genes implicated in lung tumorigenesis, immune-response associated pathways, genes previously associated with smoking, and marker genes for alveolar type II pneumocytes, while the best classifier from supervised analysis comprised genes strongly associated with proliferation, but also genes previously associated with smoking.
Conclusions: Based on gene expression profiling, we demonstrate that never-smokers can be identified with high sensitivity in both tumor material and normal airway epithelial specimens. Our results indicate that tumors arising in never-smokers, together with a subset of tumors from smokers, represent a distinct entity of lung adenocarcinomas. Taken together, these analyses provide further insight into the transcriptional patterns occurring in lung adenocarcinoma stratified by smoking history.
Background
Due to high incidence and poor survival, lung cancer is
the leading cause of cancer death worldwide. Small
cell lung cancer accounts for about 15% of all lung
cancer diagnoses whereas non-small cell lung cancer
constitutes the majority of cases, primarily including
adenocarcinoma (AC) and squamous cell carcinoma.
Although the use of cigarettes is the major pathogenic
factor, not all cases of lung cancer can be attributed to
smoking [1]. Lung cancer in never-smokers has been
suggested to represent a different disease entity
compared to lung cancer arising in smokers [2, 3].
Specifically, lung cancer in never-smokers has been associated
with female sex, East Asian ethnicity, AC histology,
differences in mutational pattern of EGFR, KRAS, and
TP53, and response to EGFR inhibitors [2-4]. However,
despite numerous reports of gene expression-derived AC
subtypes [5-10], a distinct subtype comprising only or
predominantly never-smokers has not been identified.
Taken together, this warrants further investigation of the
transcriptional differences between AC arising in
never-smokers and smokers.
In the present study, we aimed to delineate
transcriptional differences between never-smokers and current/
former smokers with AC by both unsupervised and
supervised gene expression analysis, combined with
conventional molecular assays, measurements of pathway
activation by different gene expression metagenes, and
histopathological data, across several AC data sets.
Methods
Ethics statement
The study was approved by the Regional Ethical Review
Board in Lund, Sweden (Registration no. 2004/762 and
2008/702). Written informed consent was obtained from
all patients diagnosed after 2004, whereas for the
retrospective part of the material, i.e. patients diagnosed
earlier than 2004, study inclusion was approved by the
Regional Ethical Review Board in Lund, Sweden, unless
patients (or their family members/survivors) stated
otherwise when they were informed about the study in
2006.
Patient material
Thirty-nine AC were obtained from patients selected for surgery
of early-stage, primary lung cancer between 1989 and 2007
at the Skåne University Hospital, Sweden. Smoking
history was obtained from patient charts and included 13
current smokers, 14 former smokers, and 10
never-smokers. Among the former smokers, four patients had quit
smoking less than one year before surgery. None of the
patients had received neoadjuvant treatment prior to
surgery. Within an hour after lobectomy/pulmectomy, a
biopsy from a macroscopically representative area of the
tumor was selected by a lung cancer pathologist (most
often LJ) and fresh-frozen at -80 °C. DNA and RNA
were subsequently extracted from the fresh-frozen
specimens according to published protocols [11]. Tumor
histology of all original tumor blocks was confirmed by
a lung cancer pathologist (LJ). With the exception of
one node-positive case (N1) and one case with non-evaluable
N-status, all cases were T1-4N0M0. Clinical and
histopathological data are summarized in Table 1.
External lung AC expression data sets
The DCC [12] (n = 444, Affymetrix U133A), GSE10072
[13] (n = 58, Affymetrix U133A), GSE12667 [14]
(n = 75, Affymetrix U133 2 plus), Beer et al. [7] (n = 86,
Affymetrix HU6800), GSE32863 (n = 58, Illumina WG6
version 3), and GSE11969 [9] (n = 158 including 90 AC,
Agilent GPL7015) gene expression data sets were used for
supervised analysis and to validate the gene signature
derived from unsupervised analysis. The GSE7895 [15]
(n = 104, Affymetrix U133A), GSE19027 [16] (n = 52,
Affymetrix U133A), GSE19667 [17] (n = 121, Affymetrix U133
2 plus), GSE11952 [18] (n = 83, Affymetrix U133 2 plus),
GSE32863 (n = 58, Illumina WG6 version 3), and
GSE10072 (n = 49, Affymetrix U133A) data sets were used
to investigate the gene signature from unsupervised analysis
in histologically normal bronchial airway epithelial cells or
normal adjacent lung tissue (GSE32863 and GSE10072).
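As a quick consistency check, the six normal-tissue data sets listed above can be tallied to confirm that their sizes add up to the n=467 reported in the abstract. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the accession numbers, sample counts, and platform labels are simply copied from the text.

```python
from collections import defaultdict

# Sample counts and platforms for the six normal-tissue data sets,
# copied from the text. The variable name is a hypothetical choice.
NORMAL_DATA_SETS = {
    "GSE7895": (104, "Affymetrix U133A"),
    "GSE19027": (52, "Affymetrix U133A"),
    "GSE19667": (121, "Affymetrix U133 2 plus"),
    "GSE11952": (83, "Affymetrix U133 2 plus"),
    "GSE32863": (58, "Illumina WG6 version 3"),
    "GSE10072": (49, "Affymetrix U133A"),
}


def total_samples(data_sets):
    """Sum the sample counts over all data sets."""
    return sum(n for n, _platform in data_sets.values())


def samples_per_platform(data_sets):
    """Group the sample counts by microarray platform."""
    counts = defaultdict(int)
    for n, platform in data_sets.values():
        counts[platform] += n
    return dict(counts)


if __name__ == "__main__":
    print("Total normal samples:", total_samples(NORMAL_DATA_SETS))
    for platform, n in samples_per_platform(NORMAL_DATA_SETS).items():
        print(f"{platform}: {n}")
```

The total comes to 467, matching the abstract. The same tally is not attempted for the tumor data sets, because their listed sizes do not add up to the n=687 given in the abstract and this section does not explain the difference.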
Only probe sets present on the U133A chip were used for
U133 2 plus arrays in all analys (...truncated)