Targeted sequencing of chromosome 15q25 identified novel variants associated with risk of lung cancer and smoking behavior in Chinese
Yang Cheng 2
Cheng Wang 2
Meng Zhu 2
Juncheng Dai 1 2
Yuzhuo Wang 2
Liguo Geng 2
Zhihua Li 2
Jiahui Zhang 2
Hongxia Ma 1 2
Guangfu Jin 1 2
Dongxin Lin 0
Zhibin Hu 1 2
Hongbing Shen 1 2
0 State Key Laboratory of Molecular Oncology, Cancer Institute and Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College , Beijing 100021 , China
1 Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center of Cancer Medicine, Nanjing Medical University , Nanjing 211166 , China
2 Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University , Nanjing 211166 , China
Previous genome-wide association studies (GWASs) in populations of European descent identified a lung cancer susceptibility locus at 15q25 that was biologically associated with nicotine addiction. However, the allele frequency of susceptibility variants identified in this region varied dramatically across European and Asian populations, suggesting that additional risk single nucleotide polymorphisms (SNPs) in Asians need to be identified. Thus, we conducted a fine-mapping study of chromosome 15q25 using targeted resequencing of 200 lung cancer cases and 300 controls of Chinese descent. An approximate conditional and joint analysis of the discovery data revealed two novel SNPs with independent effects (rs6495304: OR = 1.79, P = 9.37 × 10−4; and rs74733525: OR = 1.68, P = 8.05 × 10−3). Both variants were common in Asians but rare or monomorphic in Whites. These results were further supported by in silico validation including 8047 cases and 8898 controls from multiethnic lung cancer GWASs (rs6495304: OR = 1.32, P = 1.21 × 10−11; and rs74733525: OR = 1.29, P = 4.29 × 10−4); however, rs6495304 demonstrated significant effects only in ever-smokers (P = 0.004 for heterogeneity test of smoking). Mediation analysis indicated that smoking behavior may mediate the effect of rs6495304 on lung cancer risk. Furthermore, expression quantitative trait loci analysis showed the risk allele (A) of rs6495304 was significantly associated with lower mRNA expression of CHRNA3 (P = 0.029) in 81 hypothalamic tissue samples. This finding provides new insights into the association between lung cancer susceptibility and the 15q25 locus.
Abbreviations
GWAS genome-wide association studies
MAF minor allele frequency
nAChR nicotinic acetylcholine receptor subunit
SNP single nucleotide polymorphism

Lung cancer is the leading cause of cancer-related deaths in the world, with more than one million deaths annually (1). In 2015, it was estimated annual lung cancer deaths in China had increased to 610200 (2). Smoking is the major risk factor for lung cancer, relating to approximately 90% of lung cancer cases (3). As the world’s leading tobacco producing and consuming country, China is home to one-third of all smokers globally. In 2010, it was estimated that there were 301 million current smokers in China and that 85.6% of them smoked daily (4). Although new, efficacious techniques for smoking cessation have helped to reduce the number of smokers significantly, less than 5% succeed. Of these smokers, approximately 60% are addicted to nicotine (5).

Many aspects of cigarette-smoking behavior cluster in families (6). Evidence from twin and family studies revealed that genetic factors are important to the etiology of nicotine dependence, with an estimated heritability of 0.56 (7). The risk of developing nicotine dependence for the sibling of a nicotine-dependent individual is twice that of the general population (6). A cluster of three genes, CHRNA5, CHRNA3 and CHRNB4, on chromosome 15q25 encodes neuronal nicotinic acetylcholine receptor subunits (nAChRs). nAChRs belong to the super family of ligand-gated ion channels that can mediate fast signal transmission at synapses and modulate the release of several neurotransmitters. They are also the initial physiological targets of nicotine in the central and peripheral nervous system. Within a few seconds of smoking, nicotine is delivered to the synapses where these receptors are expressed to initiate nicotine addiction (8). Consistent with an important role for nAChRs in regulating nicotine intake, knockdown of α5 nAChR subunits in the brain region in rats decreased their sensitivity to reward-inhibiting actions compared with that of control rats (9).

Recent genome-wide association studies (GWASs) (10–12) in Caucasian populations identified three single nucleotide polymorphisms (SNPs) (rs8034191, rs1051730 and rs16969968) at this locus that were associated with lung cancer risk. It was suggested that these SNPs might exert their effects on lung cancer through a nicotine-related pathway. However, it is worth noting that the allelic frequencies of the variants identified were low in Asians (minor allele frequency, MAF < 0.05), and no association was found in relation to smoking behaviors or lung cancer risk in Chinese populations (13). Our previous work identified four novel genetic variants at this locus using haplotype-tagging strategy (13), but may have missed variants outside the selected linkage disequilibrium (LD) block. Targeted resequencing, which is frequently used in fine-mapping studies, can provide a comprehensive association map (14–16).

In this study, we aimed to identify novel genetic variations at 15q25 associated with lung cancer risk in the Han Chinese population. We conducted a two-stage fine-mapping study, which consisted of a discovery cohort with 200 lung cancer cases and 300 controls and an in silico validation cohort with 8047 cases and 8898 controls.

Materials and methods
We performed a two-stage case–control analysis. It was approved by the Institutional Review Board of Nanjing Medical University. Informed consent was obtained from each subject at recruitment. In the discovery stage, 200 lung cancer cases and 300 controls were identified for targeted resequencing; these individuals were also included in our previous GWAS data (17,18). Cases were defined as newly diagnosed lung cancer patients and consecutively recruited from the Cancer Hospital of Jiangsu Province and the First Affiliated Hospital of Nanjing Medical University beginning in 2003. Controls were randomly selected from a community-based screening program for noninfectious disease conducted in Jiangsu Province during the same period when the cases were recruited. Lung cancer patients who had a history of cancer, metastasized cancer from other organs, radiotherapy or chemotherapy were excluded. Controls were frequency-matched to the cases for age (±5 years) and sex. All subjects were genetically unrelated and of Han Chinese descent. Participants were face-to-face interviewed by trained interviewers to collect information on demographic data and provided an approximately 5 ml venous blood sample. Individuals were defined as smokers if they had smoked an average of one cigarette or more per day for at least 1 year in their lifetime; otherwise, they were defined as nonsmokers.

In the replication stage, samples were obtained from our published NJMU GWAS data (Nanjing Medical University GWAS from Nanjing and Beijing: 2331 cases and 3077 controls) (17,18) and NCI GWASs (the National Cancer Institute GWASs: 5716 cases and 5821 controls) (19). Detailed information about the subjects involved in this study is shown (see Supplementary Table 1, available at Carcinogenesis Online). NCI GWASs obtained via the database of Genotypes and Phenotypes (dbGaP) included samples from four studies: (i) the Environment and Genetics in Lung Cancer Etiology (EAGLE) study; (ii) the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study (ATBC); (iii) the Cancer Prevention Study II Nutrition Cohort (CPSII) and (iv) the Prostate, Lung, Colon, Ovary Screening Trial (PLCO) (19). We obtained EAGLE study data from dbGaP phs000093.v2.p2, which included 1945 cases and 1992 controls (SLD) and the other three studies data from dbGaP phs000336.v1.p1, which included 3782 cases and 3840 controls (CADM). Replication stage analyses were conducted separately with subsequent meta-analysis.

Quality control and imputation of GWAS data sets
The NJMU GWAS, which contained 2331 cases and 3077 controls, was conducted using an Affymetrix Genome-Wide Human SNP Array 6.0 with standard quality control procedures described in previous papers (17,18). Although the NCI GWASs from dbGaP were not deposited until after the initial quality control, we performed standard quality control on the data. We excluded individuals with low call rates (<95%), familial relationships and extreme heterozygosity rates and SNPs with low call rates (<95%), MAFs < 0.05 and P < 1 × 10−6 for the Hardy–Weinberg equilibrium. As a result, 1937 cases and 1984 controls from SLD and 3779 cases and 3837 controls from CADM remained. To facilitate further analysis, the genotyping data were imputed using data from the 1000 Genomes Project (the Phase III integrated variant set release, across 2504 samples) as a reference. We phased the haplotypes with Shapeit v2 (http://www.shapeit.fr/, Phasing step) and performed imputations with IMPUTE2 (http://mathgen.stats.ox.ac.uk/impute/impute_v2.html). Poorly imputed SNPs, defined by an information measure (Is) < 0.40, were further excluded from the analysis.

Targeted re-sequencing and genotyping
We explored the LD structure around the SNP rs1051730, as reported by GWASs (10–12), using the HapMap Project database (phase II + III Feb 09, on NCBI B36 assembly, dbSNP126). We identified an LD block spanning from chr15:78800300 to chr15:78983700 (hg19) in our study. In total, 210 probes (total size: 183 kb; coverage: 76.19%) were designed using Agilent SureSelect software (http://earray.chem.agilent.com/eArray). Genomic DNA was captured according to the standard Agilent SureSelect protocol (Agilent Technologies, Santa Clara, CA, USA), and associated captured libraries were sequenced on the Genome Analyzer IIx (Illumina, San Diego, CA, USA) (20). After
removing reads containing sequencing adapters and low-quality reads using the FASTQ Quality Filter tool, high-quality reads were aligned to the human reference genome version hg19 using Burrows-Wheeler Aligner (BWA, V.0.5.9) (http://bio-bwa.sourceforge.net) (21). Picard Tools (http://picard.sourceforge.net/) was used to mark duplicates, and base quality scores were recalibrated using the Genome Analysis Toolkit (GATK, v1.0.5974) from the Broad Institute (22). Finally, variant calling was performed through GATK and Freebayes (23), and only variants identified by both tools were considered.

For quality control, we excluded 8 affected subjects and 22 control subjects because they (i) had a concordance rate <90% as determined by comparing genotypes against existing GWAS genotypes or (ii) yielded a read depth <10× across samples. A total of 1385 variants were detected through targeted resequencing; of these, 1016 were excluded from subsequent analyses for the following reasons: (i) call rate of genotype < 90%; (ii) P value for Hardy–Weinberg equilibrium < 1 × 10−4 in case, control or all subjects; (iii) MAF < 0.05 (control); (iv) concordance rate < 90% with the previous GWAS for overlapping variants; or (v) absence from the 1000 Genomes Project (www.1000genomes.org) (QC details in Supplementary Figure 1, available at Carcinogenesis Online). Finally, 369 common SNPs in 192 affected subjects and 278 controls were retained for further association analysis.

Association analysis of variants with lung cancer risk
In the discovery stage, associations between each variant and lung cancer risk were calculated using logistic regression models with adjustment for age, sex and smoking status in PLINK 1.90. Regional plots were created using LocusZoom (24). An approximate conditional and joint analysis approach using genome-wide complex trait analysis (GCTA: Online Methods) was performed to select index variants in a specific region (25) and those with nominal evidence of association (P < 0.05) were selected for validation. In the validation stage, association analyses of the SNPs identified were performed using SNPTEST v2.5 under a probabilistic dosage model with adjustment for age, sex, pack-years and the first principal component (PC) in NJMU GWAS; age, sex, smoking status and the first PC in SLD GWAS; and age, sex and the first PC in CADM GWAS. Finally, fixed-effect meta-analysis was conducted to assess the pooled genetic effects. Cochran’s Q statistic and I2 were calculated using STATA software (V.8.0, College Station, TX, USA). General analyses were performed with R software (V.3.1.1; The R Foundation for Statistical Computing).

Association of SNPs with smoking behavior
Principal component analysis is a useful tool to search for important characteristics among correlated variables. Principal component analysis was performed using the R prcomp function with standard parameters (26). It allows the identification of latent variables [principal components (PCs)] in the data based on observed variables (in our study, four parameters: pack-years, cigarettes per day, smoking duration and smoking status). A Wilcoxon test was used to assess the impact of variants on smoking phenotypes such as pack-years, cigarettes per day, smoking duration and the top PC.

Mediation analysis
We implemented the Baron and Kenny approach for mediation analysis (27) to assess whether smoking behavior mediated the effect of SNP on lung cancer risk, and adapted this approach to address the case–control design by fitting the mediator model only among controls (28). PC was used as a measure of smoking behavior. For each SNP, the mediation analyses were based on the following three regression models (Equations 1–3). All the models were adjusted for covariates of age and sex.

M = i2 + a ∗ X + e2 ∗ Covariates (1)

Y = i1 + c ∗ X + e1 ∗ Covariates (2)

Y = i3 + c′ ∗ X + b ∗ M + e3 ∗ Covariates (3)

Here, X, M and Y denote the genetic variants, smoking behavior and lung cancer, respectively. The terms i1–i3 represent the error terms for each equation, and e1–e3 represent coefficients of covariates. In these equations, the indirect effect a, denoted as a, was the linear regression coefficient of the mediator on an SNP. The total effect, denoted as c, was the logistic regression coefficient of lung cancer on the SNP without considering the mediator. The indirect effect b, denoted as b, was the logistic regression coefficient of lung cancer on the mediator when considering the SNP and the direct effect, denoted as c′, was the logistic regression coefficient of lung cancer on the SNP when considering the mediator. The product of indirect effect a and indirect effect b was termed the indirect effect. Bootstrapping was used to evaluate the significance of the indirect effect based on a resample-based method with replacement (29,30). The proportion mediated by smoking behavior is given by the ratio between the indirect effect and the total effect.

Explained variance
Variants identified in our current analysis and those reported in previous studies at this locus were used to calculate the respective variances by assuming the prevalence of lung cancer to be 0.06% (1). We obtained R2 from a linear regression and then transformed R2o to R2l on the liability scale by using R code described previously (31).

Functional element analysis and gene expression analysis
The novel SNPs were investigated for the presence of chromatin histone marks and hypersensitive DNAse elements using data from ENCODE included in HaploReg v4.1 (http://www.broadinstitute.org/mammals/haploreg/haploreg.php) and the UCSC genome browser (https://genome.ucsc.edu/cgi-bin/hgGateway). The results and boxplots of the expression quantitative trait loci analysis in the key regions of brain from the genotype-tissue expression (GTEx) project were obtained from the GTEx Portal (http://www.gtexportal.org/).

Results
Identification of new lung cancer susceptibility SNPs
We found several significant signals at 15q25 (see Supplementary Figure 2A, available at Carcinogenesis Online). Through approximate conditioning with GCTA, we identified two novel independent SNPs significantly associated with lung cancer risk (Table 1): rs6495304 (OR = 1.79, 95% CI: 1.27–2.50, P = 9.37 × 10−4) in the intron of AGPHD1 and rs74733525 (OR = 1.68, 95% CI: 1.14–2.46, P = 8.05 × 10−3) in the intron of CHRNA3. Based on LD analysis (Supplementary Figure 2B, available at Carcinogenesis Online), we found that rs6495304 and rs74733525 were not located in the previously reported haplotypes (r2 < 0.20) (Supplementary Table 2, available at Carcinogenesis Online).
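As a rough consistency check on the discovery-stage estimates for rs6495304 and rs74733525, an odds ratio and its 95% CI can be converted back into a Wald z statistic and a two-sided P value. The sketch below assumes a normal approximation on the log-odds scale and a CI of the form log(OR) ± 1.96 × SE; the function name `wald_from_ci` is hypothetical, and this is not the PLINK or GCTA procedure used in the study. Because the published OR and CI are rounded, the recovered P values only approximate the reported ones.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log(OR), its SE, the Wald z and a two-sided P from an OR and 95% CI.

    Hypothetical helper: assumes the CI was built as log(OR) +/- z_crit * SE.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided normal tail probability: P = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Discovery-stage estimates quoted in the text (Table 1)
reported = {
    "rs6495304": (1.79, 1.27, 2.50),
    "rs74733525": (1.68, 1.14, 2.46),
}
for snp, (or_, lo, hi) in reported.items():
    beta, se, z, p = wald_from_ci(or_, lo, hi)
    print(f"{snp}: beta={beta:.3f} SE={se:.3f} z={z:.2f} P~{p:.1e}")
```

For both SNPs the recovered P values fall close to the reported 9.37 × 10−4 and 8.05 × 10−3; the small gaps reflect rounding of the published intervals.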
Ten genetic variants have been identified by previous studies and association analyses between these variants and lung cancer risk were demonstrated (Supplementary Table 3, available
Potential regulatory mechanisms underlying the association between rs6495304 and smoking behavior
SascaaodwuaNttntuighnteACdPhgecasGtrreeropsP(csrrf1Hritfosn6eariD1uonv9ii0g1srl6ig5oeki9aSnn1u.o9nNe7gsu6Bs3dP8reiri0snsserariapO(stpedrno6Csnres6rduHosl97t,lib3Rrn2st8reNs,8ess68ad2A)180w.ne5224Tasde)92thiara1g3Ceeen4n7fnHfd3rd4d.eR8eicralNrl5tsuteiACa1cansH2b5tingz9)liRdoee1cwNs.n0arWeA9wsnsr83c6eeoe4,e4rfarara9esels5Ctr8esfs3fHiso0oes0s3Rckcc9e4iNtonwa1swAntt9ebeif3e1raidre,realetrmltcwswyAoe2eiexnGt0iddech3sPnetl6iHsnulh5ftDudta3oen1ie4tuncdgrattlh,e IfsrtSaTmnoiehountuafepneearneprodkxp(deltDdepoenaditHlsmtctohiSieteberoa)idyneaontiltnn,ntalratheesnwr1hwl6ey2eee4Hl,i9pdyFgtr35oiihehKgf3itgftdeu04eeue4Mrnramlneetlawe tntai1t4aiaettfoAlrisrsrecn,iiydmclegaaoelntemvvlscaecaaaolhliiratnlitainelad(ainldSfebineiucslpsltiaepnmerastpenaoeDlduCddtfoeiNsacmfriatrantscthenis6giohnnea4neeot9tHInga5 aamechh3rnpsoy0yaoeslm4pnsodoTeicdcesRparelieaebsarsOcg3reltr3neneebie() dollsa.a6einisnsWttmseeieoa,vdede)ni.wntttodhytehnee
affinity of POU class 2 homeobox 2 (POU2F2) (Supplementary Figure 4B, available at Carcinogenesis Online), a common transcription factor binding site in immunoglobulin gene promoters. Further expression quantitative trait loci annotation suggested that the A allele of rs6495304 was associated with the decreased expression of CHRNA3 in the hypothalamus (P = 0.029, Supplementary Figure 4C, available at Carcinogenesis Online).

In silico validation of rs6495304 and rs74733525
We further validated the associations between the rs6495304 and rs74733525 genetic variants and lung cancer risk using in silico data from multiple lung cancer GWASs conducted in Asian and European populations. In Asian GWAS subjects, both SNPs were associated with increased odds of lung cancer (rs6495304: OR = 1.35, 95% CI: 1.22–1.47, P = 5.16 × 10−10; rs74733525: OR = 1.29, 95% CI: 1.12–1.49, P = 4.29 × 10−4, Table 1). In Caucasian GWAS subjects, rs6495304 demonstrated a weaker significance (OR = 1.24, 95% CI: 1.06–1.46, P = 6.27 × 10−3), with the minor allele of rs6495304 occurring at a much lower frequency; and rs74733525 was monomorphic (Table 1). We further queried the associations within never-smoking women in Asia (32); within this population, these variants did not show any significance (data not shown).

Stratification analyses were performed by using only NJMU GWAS data because both SNPs were rare or monomorphic in European populations. We observed that the risk associated with rs6495304 was significantly different between the subgroups by gender and smoking status (test for heterogeneity, P = 2.00 × 10−3 and 4.30 × 10−3, respectively), but not the subgroups by histology for either variant (P = 0.50 and 0.89, Table 2). However, we still found that rs6495304 was significantly associated with different subtypes of non-small cell lung cancer and rs74733525 was only significantly associated with adenocarcinoma, further confirming that these variants in the CHRN genes have different effects in different subtypes of lung cancer.

AC, adenocarcinoma; SC, squamous cell carcinoma; SCC, small cell carcinoma; other, includes large cell lung cancer and mixed cell carcinoma.
a Wild-type homozygote/heterozygote/variant homozygote.
b Adjusted by age, gender and pack-years of smoking where appropriate.
c P for heterogeneity.

Single-SNP association with smoking behavior
In addition, a significant multiplicative interaction was observed between rs6495304 and smoking behavior (P = 0.004, Table 3). We further repeated the interaction analysis with smoking using the pooled SLD and NJMU GWAS data sets, and obtained similar but more pronounced statistical significance (P = 1.59 × 10−6, Supplementary Table 4, available at Carcinogenesis Online). We next examined the association between rs6495304 and smoking behavior in all 5408 subjects. The frequency of the A allele was higher in heavier smokers and subjects with longer smoking duration (consume > 48 pack-years, >25 cigarettes per day, PC > 35 and smoke > 40 years, Table 4). Further analysis suggested that carriers of A allele (adverse) in control subjects who were current smokers (n = 1083) were more likely to be heavy smokers (pack-years: P = 0.043, smoking duration: P = 0.007, and PC: P = 0.028) (Supplementary Figure 3, available at Carcinogenesis Online).

a Categorized using the 25th and 75th percentile pack-years, cigarette consumption, years smoked and first principal component derived from smoking variables.

Mediation analysis
To provide further evidence that smoking behavior was a mediator of the association between rs6495304 and lung cancer, we conducted mediation analysis (Figure 1). In our model, the total effect of the SNP on lung cancer risk was statistically significant (c = 0.302, SE = 0.048, P < 0.001). Both the effect of the SNP on the PC and the effect of the PC on lung cancer risk were also significant (a = 1.672, SE = 0.714, P = 0.019, b = 0.014, SE = 0.001, P < 0.001). Bootstrapping analysis suggested that smoking
behavior was a mediator (indirect effect a*b = 0.023, SE = 0.010, P = 0.017). However, the direct effect of the SNP on lung cancer risk was still significant after adjustment for smoking behavior (c′ = 0.278, SE = 0.049, P < 0.001), suggesting that smoking behavior only partially explains the effect of rs6495304 on lung cancer risk (ab/c = 7.62%).

Figure 1. Mediation analysis: graphical representation. Smoking behavior (PC: acquired from smoking status, cigarettes/day, smoking duration and pack-years) mediation model of the relationship between rs6495304 and lung cancer, adjusted by age and sex. In the model, the percentage of the effect mediated by smoking behavior is ~7.62% of the total effect of rs6495304 on lung cancer risk. *P < 0.05; **P < 0.001.

Variance explained by the independent variants in 15q25 region
On the basis of the two SNPs we identified and those reported in previous papers, we estimated the proportion of phenotypic variance explained using a liability threshold model with the population prevalence of 0.06% for lung cancer (1). We performed a heritability analysis using NJMU GWAS data and found that rs6495304 and rs74733525 could explain 2.88‰, whereas the 10 SNPs reported previously in this region could only explain 1.29‰ (Supplementary Table 7, available at Carcinogenesis Online). In total, these twelve SNPs combined could explain 4.43‰ of the phenotypic variance. The two identified variants alone explain approximately 65.01% of the phenotypic variance attributable to genetic variation.

Discussion
Chromosome 15q25 is a bona fide locus associated with lung cancer risk and nicotine addiction that harbors the α5α3β4 family of nicotinic receptor genes. In this study, we provided a more comprehensive and accurate genetic landscape at 15q25 in Chinese individuals by using targeted resequencing technology. We identified two novel genetic variants (rs6495304 and rs74733525) that were associated with lung cancer risk and subsequently validated in large-scale GWASs within both Asian and European populations. We also presented data that the rs6495304 A allele may affect lung cancer development partially by pathways related to smoking behavior. In our analysis, rs6495304 and rs74733525 could explain approximately 65.01% of the phenotypic variance among the twelve SNPs for lung cancer in this region.

Many previous studies of the effect of the 15q25 locus on lung cancer risk underscored the differences in LD structure between populations of European, Asian and African ancestry (34,35). This phenomenon is consistent with early population migration and evolution patterns (36). Refining this region may help to identify association signals that are more likely to be population specific. The two variants identified in our study varied greatly in allelic frequencies among different ethnicities. Rs6495304 is common in Han Chinese in Beijing (CHB: MAF = 0.24) but rare in Centre d’Etude du Polymorphisme Humain Utah (CEU: MAF = 0.01) according to data from the 1000 Genomes Project. Rs74733525 is common in CHB (MAF = 0.08) and monomorphic in CEU.

In addition, studies have investigated the mediation effect of smoking behavior on the relationship between variants at the 15q25 region and lung cancer risk with little consensus as to the relative impact of the variants on the propensity for smoking or a direct carcinogenic effect. VanderWeele et al. (37) found no evidence that the CHRNA5-A3-B4 was associated with lung cancer, in interaction with smoking, whereas others found that this region indirectly acts on lung cancer risk through the mediation of smoking behavior (38). In our study, we demonstrated that smoking was a mediator of the association between rs6495304 and lung cancer risk, which accounts for only 7.62% of the total effect of rs6495304 on lung cancer risk. We further confirmed the mediating effect by examining the effect among never-smokers (P = 0.08). The mediation analyses suggested that the association of rs6495304 with lung cancer risk is only marginally explained by smoking; thus, smoking behavior is necessary, but not sufficient, for the association between the SNP and lung cancer risk. Rs6495304 also showed a statistically significant multiplicative interaction with gender (P = 0.002, data not shown), indicating that gender is also important for susceptibility to lung cancer; however, this might be confounded by the fact that 95% of patients among ever-smokers were men, suggesting that this association may be related to the sex difference in the propensity for smoking. Further, we performed stratification analyses by smoking status according to gender and only observed a stronger association in male ever-smokers (Supplementary Table 5, available at Carcinogenesis Online). Together, it was speculated that the interaction between rs6495304 and gender was due to smoking.

Certain biologic hypotheses are consistent with this statistical evidence. Rs6495304, residing in the first intron of the AGPHD1 gene, demonstrates evidence of histone modifications consistent with promoter and enhancer activity in several cell lines. It has also been hypothesized that rs6495304 may alter the POU2F2 motif. POU2F2 expressed in mammalian neuronal cells (39) has been reported to be involved in important brain processes including neuronal and vascular development, neuronal survival, and basic cellular functions. In the expression quantitative trait loci analysis, we found that rs6495304 A allele was significantly associated with low mRNA expression of α3 nAChR. Nicotine could act on these receptors to release corticotropin–releasing factor in axon terminals of median eminence (40,41). Ainhoa et al. (42) indicated that corticotropin–releasing factor plays a critical role in stress-induced reinstatement of nicotine-seeking behavior. Thus, corticotropin–releasing factor-dependent stress produced by the change of CHRNA3 expression in the hypothalamus might reinstate a previously extinguished nicotine-seeking behavior.

In this study, we reported two novel SNPs on chromosome 15q25 previously undetected in Chinese population and highlighted the role of smoking behavior in mediating the link between rs6495304 and lung cancer risk. As sample size places a limitation on the ability to investigate rare variants, studies with larger sample sizes are needed. Further functional studies are also required to confirm the roles of the newly discovered variants.
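To make the mediation arithmetic concrete, the coefficients reported for rs6495304 can be combined in a short sketch. It assumes the product-of-coefficients definition given in Materials and methods (indirect effect = a × b, proportion mediated = a × b / c); the function name `proportion_mediated` is hypothetical, and the bootstrap step used to test the indirect effect is not reproduced here.

```python
def proportion_mediated(a, b, c):
    """Return (indirect effect, proportion mediated) as a*b and a*b/c.

    Hypothetical helper illustrating the product-of-coefficients approach.
    """
    indirect = a * b
    return indirect, indirect / c

# Coefficients reported in the Results for rs6495304
a, b = 1.672, 0.014          # SNP -> PC, and PC -> lung cancer (adjusted for the SNP)
c, c_prime = 0.302, 0.278    # total and direct effects of the SNP

indirect, share = proportion_mediated(a, b, c)
print(f"indirect effect a*b = {indirect:.4f}")
print(f"proportion mediated = {share:.2%}")

# Same ratio using the rounded indirect effect quoted in the text (0.023)
print(f"0.023 / c = {0.023 / c:.2%}")

# Difference-in-coefficients view of the mediated part
print(f"c - c' = {c - c_prime:.3f}")
```

With the rounded indirect effect of 0.023 the ratio is 0.023/0.302 ≈ 7.62%, matching the reported value; the unrounded product a × b gives a slightly larger share, and c − c′ = 0.024 is of similar size.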
Supplementary material
Supplementary data are available at Carcinogenesis online.

Funding
This work was funded by the National Natural Science Foundation of China (81230067, 81521004, 81302488 and 81422042), the Outstanding Young Fund of Jiangsu Province (grant number BK20160046), National Program for Support of Top-notch Young Professionals from the Organization Department of the CPC Central Committee, Jiangsu Specially-Appointed Professor project, the Priority Academic Program for the Development of Jiangsu Higher Education Institutions [Public Health and Preventive Medicine] and Top-notch Academic Programs Project of Jiangsu Higher Education Institutions (PPZY2015A067). The authors wish to thank all the study participants, research staff and students who participated in this work.

Acknowledgments
The authors would like to thank the patients and the supporting staff in this study.
Conflict of Interest Statement: None declared.

References
1. Ferlay, J. et al. (2015) Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int. J. Cancer, 136, E359–E386.
2. Chen, W. et al. (2016) Cancer statistics in China, 2015. CA Cancer J. Clin., 66, 115–132.
3. Hecht, S.S. (2003) Tobacco carcinogens, their biomarkers and tobacco-induced cancer. Nat. Rev. Cancer, 3, 733–744.
4. Li, Q. et al. (2011) Prevalence of smoking in China in 2010. N. Engl. J. Med., 364, 2469–2470.
5. Batra, A. et al. (2009) Treatment of tobacco dependence: a responsibility of psychiatry and addiction medicine. Nervenarzt, 80, 1022–9.
6. Bierut, L.J. et al. (1998) Familial transmission of substance dependence: alcohol, marijuana, cocaine, and habitual smoking: a report from the Collaborative Study on the Genetics of Alcoholism. Arch. Gen. Psychiatry, 55, 982–988.
7. Li, M.D. et al. (2003) A meta-analysis of estimated genetic and environmental effects on smoking behavior in male and female adult twins. Addiction, 98, 23–31.
8. Wei, C. et al. (2011) A case-control study of a sex-specific association between a 15q25 variant and lung cancer risk. Cancer Epidemiol. Biomarkers Prev., 20, 2603–2609.
9. Fowler, C.D. et al. (2011) Habenular α5 nicotinic receptor subunit signalling controls nicotine intake. Nature, 471, 597–601.
10. Amos, C.I. et al. (2008) Genome-wide association scan of tag SNPs iden-
17. Hu, Z. et al. (2011) A genome-wide association study identifies two new lung cancer susceptibility loci at 13q12.12 and 22q12.2 in Han Chinese. Nat. Genet., 43, 792–796.
18. Dong, J. et al. (2012) Association analyses identify multiple new lung cancer susceptibility loci and their interactions with smoking in the Chinese population. Nat. Genet., 44, 895–899.
19. Landi, M.T. et al. (2009) A genome-wide association study of lung cancer identifies a region of chromosome 5p15 associated with risk for adenocarcinoma. Am. J. Hum. Genet., 85, 679–691.
20. Zuzarte, P.C. et al. (2014) A two-dimensional pooling strategy for rare variant detection on next-generation sequencing platforms. PLoS One, 9, e93455.
21. Li, H. et al. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754–1760.
22. McKenna, A. et al. (2010) The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res., 20, 1297–1303.
23. Tian, S. et al. (2016) Impact of post-alignment processing in variant discovery from whole exome data. BMC Bioinformatics, 17, 403.
24. Pruim, R.J. et al. (2010) LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics, 26, 2336–2337.
25. Yang, J. et al. (2012) Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet., 44, 369–375.
26. Lakota, K. et al. (2012) International cohort study of 73 anti-Ku-positive patients: association of p70/p80 anti-Ku antibodies with joint/bone features and differentiation of disease populations by using principal-components analysis. Arthritis Res. Ther., 14, R2.
27. Baron, R.M. et al. (1986) The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J. Pers. Soc. Psychol., 51, 1173–1182.
28. VanderWeele, T.J. (2016) Mediation analysis: a practitioner’s guide. Annu. Rev. Public Health, 37, 17–32.
29. Preacher, K.J. et al. (2008) Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behav. Res. Methods, 40, 879–891.
30. Huang, Y.T. (2015) Integrative modeling of multi-platform genomic data under the framework of mediation analysis. Stat. Med., 34, 162–178.
31. Lee, S.H. et al. (2012) A better coefficient of determination for genetic profile analysis. Genet. Epidemiol., 36, 214–224.
32. Hsiung, C.A. et al. (2010) The 5p15.33 locus is associated with risk of lung adenocarcinoma in never-smoking females in Asia. PLoS Genet., 6, 182–188.
33. Ward, L.D. et al. (2012) HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res., 40, D930–D934.
34. Ware, J.J. et al. (2012) From men to mice: CHRNA5/CHRNA3, smoking behavior and disease. Nicotine Tob. Res., 14, 1291–1299.
35. Improgo, M.R. et al. (2010) From smoking to lung cancer: the CHRNA5/A3/B4 connection. Oncogene, 29, 4874–4884.
36. Zietkiewicz, E. et al. (1997) Nuclear DNA diversity in worldwide distributed human populations. Gene, 205, 161–171.
tifies a susceptibility locus for lung cancer at 15q25.1. Nat. Genet., 40, 37. VanderWeele, T.J. et al. (2012) Genetic variants on 15q25.1, smoking,
616–622. and lung cancer: an assessment of mediation and interaction. Am.
11. Hung, R.J. et al. (2008) A susceptibility locus for lung cancer maps to J. Epidemiol., 175, 1013–1020.
nicotinic acetylcholine receptor subunit genes on 15q25. Nature, 452, 38. Wang, J. et al. (2010) Mediating effects of smoking and chronic obstr-uc
633–637. tive pulmonary disease on the relation between the CHRNA5-A3
12. Thorgeirsson,T.E. et al. (2008) A variant associated with nicotine depe-nd genetic locus and lung cancer risk. Cancer, 116, 3458–3462.
ence, lung cancer and peripheral arterial disease. Nature, 452, 638–642. 39. He, X. et al. (1989) Expression of a large family of POU-domain regu-la
13. Wu, C. et al. (2009) Genetic variants on chromosome 15q25 associated tory genes in mammalian brain development. Nature, 340, 35–41.
with lung cancer risk in Chinese populations. Cancer Res., 69, 5065–5072. 40. Okuda, H. et al. (1993) The presence of corticotropin-releasing
factor14. Laitinen, V.H. et al.; PRACTICAL Consortium. (2015) Fine-mapping the like immunoreactive synaptic vesicles in axon terminals with nicotinic
2q37 and 17q11.2-q22 loci for novel genes and sequence variants ass-o acetylcholine receptor-like immunoreactivity in the median eminence
ciated with a genetic predisposition to prostate cancer. Int. J. Cancer, of the rat. Neurosci. Lett., 161, 183–186.
136, 2316–2327. 41. Bugajski, J. et al. (2002) Involvement of prostaglandins in the
nicotine15. Kachuri, L. et al. (2016) Fine mapping of chromosome 5p15.33 based induced pituitary-adrenocortical response during social stress. J. P-hys
on a targeted deep sequencing and high density genotyping identifies iol. Pharmacol., 53(4 Pt 2), 847–857.
novel lung cancer susceptibility loci. Carcinogenesis, 37, 96–105. 42. Plaza-Zabala, A. et al. (2010) Hypocretins regulate the
anxiogenic16. Salomon, M.P. et al. (2016) GWASeq: targeted re-sequencing follow up like effects of nicotine and induce reinstatement of nicotine-seeking
to GWAS. BMC Genomics, 17, 176. behavior. J. Neurosci., 30, 2300–2310.