Targeted sequencing of chromosome 15q25 identified novel variants associated with risk of lung cancer and smoking behavior in Chinese

Carcinogenesis, May 2017

Cheng, Yang; Wang, Cheng; Zhu, Meng; Dai, Juncheng; Wang, Yuzhuo; Geng, Liguo; Li, Zhihua; Zhang, Jiahui; Ma, Hongxia; Jin, Guangfu; et al.

Yang Cheng [2], Cheng Wang [2], Meng Zhu [2], Juncheng Dai [1,2], Yuzhuo Wang [2], Liguo Geng [2], Zhihua Li [2], Jiahui Zhang [2], Hongxia Ma [1,2], Guangfu Jin [1,2], Dongxin Lin [0], Zhibin Hu [1,2], Hongbing Shen [1,2]

[0] State Key Laboratory of Molecular Oncology, Cancer Institute and Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China
[1] Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center of Cancer Medicine, Nanjing Medical University, Nanjing 211166, China
[2] Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166, China

Previous genome-wide association studies (GWAS) in populations of European descent identified a lung cancer susceptibility locus at 15q25 that was biologically associated with nicotine addiction. However, the allele frequency of susceptibility variants identified in this region varied dramatically across European and Asian populations, suggesting that additional risk single nucleotide polymorphisms (SNPs) in Asians need to be identified. Thus, we conducted a fine-mapping study of chromosome 15q25 using targeted resequencing of 200 lung cancer cases and 300 controls of Chinese descent. An approximate conditional and joint analysis of the discovery data revealed two novel SNPs with independent effects (rs6495304: OR = 1.79, P = 9.37 × 10^-4; and rs74733525: OR = 1.68, P = 8.05 × 10^-3). Both variants were common in Asians but rare or monomorphic in Whites.
These results were further supported by in silico validation including 8047 cases and 8898 controls from multiethnic lung cancer genome-wide association studies (GWASs) (rs6495304: OR = 1.32, P = 1.21 × 10^-11; and rs74733525: OR = 1.29, P = 4.29 × 10^-4); however, rs6495304 demonstrated significant effects only in ever-smokers (P = 0.004 for heterogeneity test of smoking). Mediation analysis indicated that smoking behavior may mediate the effect of rs6495304 on lung cancer risk. Furthermore, expression quantitative trait loci analysis showed the risk allele (A) of rs6495304 was significantly associated with lower mRNA expression of CHRNA3 (P = 0.029) in 81 hypothalamic tissue samples. This finding provides new insights into the association between lung cancer susceptibility and the 15q25 locus.

Abbreviations
GWAS: genome-wide association studies
MAF: minor allele frequency
nAChR: nicotinic acetylcholine receptor subunit
SNP: single nucleotide polymorphism

Introduction

Lung cancer is the leading cause of cancer-related deaths in the world, with more than one million deaths annually (1). In 2015, it was estimated that annual lung cancer deaths in China had increased to 610200 (2). Smoking is the major risk factor for lung cancer, relating to approximately 90% of lung cancer cases (3). As the world's leading tobacco producing and consuming country, China is home to one-third of all smokers globally. In 2010, it was estimated that there were 301 million current smokers in China and that 85.6% of them smoked daily (4). Although new, efficacious techniques for smoking cessation have helped to reduce the number of smokers significantly, less than 5% succeed. Of these smokers, approximately 60% are addicted to nicotine (5).

Many aspects of cigarette-smoking behavior cluster in families (6). Evidence from twin and family studies revealed that genetic factors are important to the etiology of nicotine dependence, with an estimated heritability of 0.56 (7). The risk of developing nicotine dependence for the sibling of a nicotine-dependent individual is twice that of the general population (6).

A cluster of three genes, CHRNA5, CHRNA3 and CHRNB4, on chromosome 15q25 encodes neuronal nicotinic acetylcholine receptor subunits (nAChRs). nAChRs belong to the superfamily of ligand-gated ion channels that can mediate fast signal transmission at synapses and modulate the release of several neurotransmitters. They are also the initial physiological targets of nicotine in the central and peripheral nervous system. Within a few seconds of smoking, nicotine is delivered to the synapses where these receptors are expressed to initiate nicotine addiction (8). Consistent with an important role for nAChRs in regulating nicotine intake, knockdown of α5 nAChR subunits in the brain region in rats decreased their sensitivity to reward-inhibiting actions compared with that of control rats (9).

Recent genome-wide association studies (GWASs) (10–12) in Caucasian populations identified three single nucleotide polymorphisms (SNPs) (rs8034191, rs1051730 and rs16969968) at this locus that were associated with lung cancer risk. It was suggested that these SNPs might exert their effects on lung cancer through a nicotine-related pathway. However, it is worth noting that the allelic frequencies of the variants identified were low in Asians (minor allele frequency, MAF < 0.05), and no association was found in relation to smoking behaviors or lung cancer risk in Chinese populations (13). Our previous work identified four novel genetic variants at this locus using a haplotype-tagging strategy (13), but may have missed variants outside the selected linkage disequilibrium (LD) block. Targeted resequencing, which is frequently used in fine-mapping studies, can provide a comprehensive association map (14–16).

In this study, we aimed to identify novel genetic variations at 15q25 associated with lung cancer risk in the Han Chinese population. We conducted a two-stage fine-mapping study, which consisted of a discovery cohort with 200 lung cancer cases and 300 controls and an in silico validation cohort with 8047 cases and 8898 controls.

Materials and methods

Study subjects

We performed a two-stage case–control analysis. It was approved by the Institutional Review Board of Nanjing Medical University. Informed consent was obtained from each subject at recruitment. In the discovery stage, 200 lung cancer cases and 300 controls were identified for targeted resequencing; these individuals were also included in our previous GWAS data (17,18). Cases were defined as newly diagnosed lung cancer patients and consecutively recruited from the Cancer Hospital of Jiangsu Province and the First Affiliated Hospital of Nanjing Medical University beginning in 2003. Controls were randomly selected from a community-based screening program for noninfectious disease conducted in Jiangsu Province during the same period when the cases were recruited. Lung cancer patients who had a history of cancer, metastasized cancer from other organs, radiotherapy or chemotherapy were excluded. Controls were frequency-matched to the cases for age (±5 years) and sex. All subjects were genetically unrelated and of Han Chinese descent. Participants were interviewed face-to-face by trained interviewers to collect information on demographic data and provided an approximately 5 ml venous blood sample. Individuals were defined as smokers if they had smoked an average of one cigarette or more per day for at least 1 year in their lifetime; otherwise, they were defined as nonsmokers.

In the replication stage, samples were obtained from our published NJMU GWAS data (Nanjing Medical University GWAS from Nanjing and Beijing: 2331 cases and 3077 controls) (17,18) and NCI GWASs (the National Cancer Institute GWASs: 5716 cases and 5821 controls) (19). Detailed information about the subjects involved in this study is shown (see Supplementary Table 1, available at Carcinogenesis Online). NCI GWASs obtained via the database of Genotypes and Phenotypes (dbGaP) included samples from four studies: (i) the Environment and Genetics in Lung Cancer Etiology (EAGLE) study; (ii) the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study (ATBC); (iii) the Cancer Prevention Study II Nutrition Cohort (CPSII) and (iv) the Prostate, Lung, Colon, Ovary Screening Trial (PLCO) (19). We obtained EAGLE study data from dbGaP phs000093.v2.p2, which included 1945 cases and 1992 controls (SLD), and the other three studies' data from dbGaP phs000336.v1.p1, which included 3782 cases and 3840 controls (CADM). Replication stage analyses were conducted separately with subsequent meta-analysis.

Quality control and imputation of GWAS data sets

The NJMU GWAS, which contained 2331 cases and 3077 controls, was conducted using an Affymetrix Genome-Wide Human SNP Array 6.0 with standard quality control procedures described in previous papers (17,18). Although the NCI GWASs from dbGaP were not deposited until after the initial quality control, we performed standard quality control on the data. We excluded individuals with low call rates (<95%), familial relationships and extreme heterozygosity rates and SNPs with low call rates (<95%), MAFs < 0.05 and P < 1 × 10^-6 for the Hardy–Weinberg equilibrium. As a result, 1937 cases and 1984 controls from SLD and 3779 cases and 3837 controls from CADM remained. To facilitate further analysis, the genotyping data were imputed using data from the 1000 Genomes Project (the Phase III integrated variant set release, across 2504 samples) as a reference. We phased the haplotypes with Shapeit v2 (http://, Phasing step) and performed imputations with IMPUTE2 (http://). Poorly imputed SNPs, defined by an information measure (Is) < 0.40, were further excluded from the analysis.

Targeted re-sequencing and genotyping

We explored the LD structure around the SNP rs1051730, as reported by GWASs (10–12), using the HapMap Project database (phase II + III Feb 09, on NCBI B36 assembly, dbSNP126). We identified an LD block spanning from chr15:78800300 to chr15:78983700 (hg19) in our study. In total, 210 probes (total size: 183 kb; coverage: 76.19%) were designed using Agilent SureSelect software (http://). Genomic DNA was captured according to the standard Agilent SureSelect protocol (Agilent Technologies, Santa Clara, CA, USA), and associated captured libraries were sequenced on the Genome Analyzer IIx (Illumina, San Diego, CA, USA) (20). After removing reads containing sequencing adapters and low-quality reads using the FASTQ Quality Filter tool, high-quality reads were aligned to the human reference genome version hg19 using Burrows-Wheeler Aligner (BWA, V.0.5.9) (http://bio-bwa.) (21). Picard Tools (http://picard.sourceforge.net/) was used to mark duplicates, and base quality scores were recalibrated using the Genome Analysis Toolkit (GATK, v1.0.5974) from the Broad Institute (22). Finally, variant calling was performed through GATK and Freebayes (23), and only variants identified by both tools were considered.

For quality control, we excluded 8 affected subjects and 22 control subjects because they (i) had a concordance rate <90% as determined by comparing genotypes against existing GWAS genotypes or (ii) yielded a read depth <10× across samples. A total of 1385 variants were detected through targeted resequencing; of these, 1016 were excluded from subsequent analyses for the following reasons: (i) call rate of genotype < 90%; (ii) P value for Hardy–Weinberg equilibrium < 1 × 10^-4 in case, control or all subjects; (iii) MAF < 0.05 (control); (iv) concordance rate < 90% with the previous GWAS for overlapping variants; or (v) absence from the 1000 Genomes Project (QC details in Supplementary Figure 1, available at Carcinogenesis Online). Finally, 369 common SNPs in 192 affected subjects and 278 controls were retained for further association analysis.

Association analysis of variants with lung cancer risk

In the discovery stage, associations between each variant and lung cancer risk were calculated using logistic regression models with adjustment for age, sex and smoking status in PLINK 1.90. Regional plots were created using LocusZoom (24). An approximate conditional and joint analysis approach using genome-wide complex trait analysis (GCTA: Online Methods) was performed to select index variants in a specific region (25), and those with nominal evidence of association (P < 0.05) were selected for validation. In the validation stage, association analyses of the SNPs identified were performed using SNPTEST v2.5 under a probabilistic dosage model with adjustment for age, sex, pack-years and the first principal component (PC) in NJMU GWAS; age, sex, smoking status and the first PC in SLD GWAS; and age, sex and the first PC in CADM GWAS. Finally, fixed-effect meta-analysis was conducted to assess the pooled genetic effects. Cochran's Q statistic and I^2 were calculated using STATA software (V.8.0, College Station, TX, USA). General analyses were performed with R software (V.3.1.1; The R Foundation for Statistical Computing).

Association of SNPs with smoking behavior

Principal component analysis is a useful tool to search for important characteristics among correlated variables. Principal component analysis was performed using the R prcomp function with standard parameters (26). It allows the identification of latent variables [principal components (PCs)] in the data based on observed variables (in our study, four parameters: pack-years, cigarettes per day, smoking duration and smoking status). A Wilcoxon test was used to assess the impact of variants on smoking phenotypes such as pack-years, cigarettes per day, smoking duration and the top PC.

Mediation analysis

We implemented the Baron and Kenny approach for mediation analysis (27) to assess whether smoking behavior mediated the effect of SNP on lung cancer risk, and adapted this approach to address the case–control design by fitting the mediator model only among controls (28). PC was used as a measure of smoking behavior. For each SNP, the mediation analyses were based on the following three regression models (Equations 1–3). All the models were adjusted for covariates of age and sex.

M = i2 + a × X + e2 × Covariates (1)

Y = i1 + c × X + e1 × Covariates (2)

Y = i3 + c′ × X + b × M + e3 × Covariates (3)

Here, X, M and Y denote the genetic variants, smoking behavior and lung cancer, respectively. The terms i1–i3 represent the error terms for each equation, and e1–e3 represent coefficients of covariates. In these equations, the indirect effect a, denoted as a, was the linear regression coefficient of the mediator on an SNP. The total effect, denoted as c, was the logistic regression coefficient of lung cancer on the SNP without considering the mediator. The indirect effect b, denoted as b, was the logistic regression coefficient of lung cancer on the mediator when considering the SNP, and the direct effect, denoted as c′, was the logistic regression coefficient of lung cancer on the SNP when considering the mediator. The product of indirect effect a and indirect effect b was termed the indirect effect. Bootstrapping was used to evaluate the significance of the indirect effect based on a resample-based method with replacement (29,30). The proportion mediated by smoking behavior is given by the ratio between the indirect effect and the total effect.

Explained variance

Variants identified in our current analysis and those reported in previous studies at this locus were used to calculate the respective variances by assuming the prevalence of lung cancer to be 0.06% (1). We obtained R^2 from a linear regression and then transformed R^2 to R^2 on the liability scale by using R code described previously (31).

Functional element analysis and gene expression analysis

The novel SNPs were investigated for the presence of chromatin histone marks and hypersensitive DNAse elements using data from ENCODE included in HaploReg v4.1 (http://www.broad) and the UCSC genome browser. The results and boxplots of the expression quantitative trait loci analysis in the key regions of brain from the genotype-tissue expression (GTEx) project were obtained from the GTEx Portal (http://www.gtexportal.org/).

Results

Identification of new lung cancer susceptibility SNPs

We found several significant signals at 15q25 (see Supplementary Figure 2A, available at Carcinogenesis Online). Through approximate conditioning with GCTA, we identified two novel independent SNPs significantly associated with lung cancer risk (Table 1): rs6495304 (OR = 1.79, 95% CI: 1.27–2.50, P = 9.37 × 10^-4) in the intron of AGPHD1 and rs74733525 (OR = 1.68, 95% CI: 1.14–2.46, P = 8.05 × 10^-3) in the intron of CHRNA3. Based on LD analysis (Supplementary Figure 2B, available at Carcinogenesis Online), we found that rs6495304 and rs74733525 were not located in the previously reported haplotypes (r^2 < 0.20) (Supplementary Table 2, available at Carcinogenesis Online).
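As an illustration of the fixed-effect meta-analysis named in the Methods, the following minimal Python sketch pools two odds ratios by inverse-variance weighting. It is not the authors' STATA or SNPTEST workflow: the function names are hypothetical, the standard errors are back-calculated from the 95% confidence intervals reported for rs6495304 in the Asian (OR = 1.35, 1.22–1.47) and Caucasian (OR = 1.24, 1.06–1.46) GWAS subjects, and a normal approximation is assumed throughout.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile (assumed CI construction)


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover log(OR) and its standard error from a reported 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    return beta, se


def fixed_effect_meta(estimates):
    """Inverse-variance fixed-effect pooling of (beta, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return math.exp(pooled_beta), pooled_se, z, p_value


# Rounded values quoted in the text for rs6495304 (illustrative inputs only).
studies = [
    log_or_and_se(1.35, 1.22, 1.47),  # Asian GWAS subjects
    log_or_and_se(1.24, 1.06, 1.46),  # Caucasian GWAS subjects
]
pooled_or, pooled_se, z, p_value = fixed_effect_meta(studies)
print(f"pooled OR = {pooled_or:.2f}, z = {z:.2f}, P = {p_value:.2e}")
```

Because the inputs are rounded published values rather than the underlying genotype data, the pooled estimate only approximates the combined result quoted in the abstract (OR = 1.32); the sketch is meant to show the weighting logic, not to reproduce the analysis.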
Ten genetic variants have been identified by previous studies and association analyses between these variants and lung cancer risk were demonstrated (Supplementary Table 3, available at Carcinogenesis Online). [...]

In silico validation of rs6495304 and rs74733525

We further validated the associations between the rs6495304 and rs74733525 genetic variants and lung cancer risk using in silico data from multiple lung cancer GWASs conducted in Asian and European populations. In Asian GWAS subjects, both SNPs were associated with increased odds of lung cancer (rs6495304: OR = 1.35, 95% CI: 1.22–1.47, P = 5.16 × 10^-10; rs74733525: OR = 1.29, 95% CI: 1.12–1.49, P = 4.29 × 10^-4, Table 1). In Caucasian GWAS subjects, rs6495304 demonstrated a weaker significance (OR = 1.24, 95% CI: 1.06–1.46, P = 6.27 × 10^-3), with the minor allele of rs6495304 occurring at a much lower frequency; and rs74733525 was monomorphic (Table 1). We further queried the associations within never-smoking women in Asia (32); within this population, these variants did not show any significance (data not shown).

Stratification analyses were performed by using only NJMU GWAS data because both SNPs were rare or monomorphic in European populations. We observed that the risk associated with rs6495304 was significantly different between the subgroups by gender and smoking status (test for heterogeneity, P = 2.00 × 10^-3 and 4.30 × 10^-3, respectively), but not the subgroups by histology for either variant (P = 0.50 and 0.89, Table 2). However, we still found that rs6495304 was significantly associated with different subtypes of non-small cell lung cancer and rs74733525 was only significantly associated with adenocarcinoma, further confirming that these variants in the CHRN genes have different effects in different subtypes of lung cancer.

Table notes: AC, adenocarcinoma; SC, squamous cell carcinoma; SCC, small cell carcinoma; other, includes large cell lung cancer and mixed cell carcinoma. (a) Wild-type homozygote/heterozygote/variant homozygote. (b) Adjusted by age, gender and pack-years of smoking where appropriate. (c) P for heterogeneity.

Table notes: (a) Categorized using the 25th and 75th percentile pack-years, cigarette consumption, years smoked and first principal component derived from smoking variables.

Potential regulatory mechanisms underlying the association between rs6495304 and smoking behavior

[...] affinity of POU class 2 homeobox 2 (POU2F2) (Supplementary Figure 4B, available at Carcinogenesis Online), a common transcription factor binding site in immunoglobulin gene promoters. Further expression quantitative trait loci annotation suggested that the A allele of rs6495304 was associated with the decreased expression of CHRNA3 in the hypothalamus (P = 0.029, Supplementary Figure 4C, available at Carcinogenesis Online).

Single-SNP association with smoking behavior

In addition, a significant multiplicative interaction was observed between rs6495304 and smoking behavior (P = 0.004, Table 3). We further repeated the interaction analysis with smoking using the pooled SLD and NJMU GWAS data sets, and obtained similar but more pronounced statistical significance (P = 1.59 × 10^-6, Supplementary Table 4, available at Carcinogenesis Online). We next examined the association between rs6495304 and smoking behavior in all 5408 subjects. The frequency of the A allele was higher in heavier smokers and subjects with longer smoking duration (consume > 48 pack-years, >25 cigarettes per day, PC > 35 and smoke > 40 years, Table 4). Further analysis suggested that carriers of the A allele (adverse) in control subjects who were current smokers (n = 1083) were more likely to be heavy smokers (pack-years: P = 0.043, smoking duration: P = 0.007, and PC: P = 0.028) (Supplementary Figure 3, available at Carcinogenesis Online).

Mediation analysis

To provide further evidence that smoking behavior was a mediator of the association between rs6495304 and lung cancer, we conducted mediation analysis (Figure 1). In our model, the total effect of the SNP on lung cancer risk was statistically significant (c = 0.302, SE = 0.048, P < 0.001). Both the effect of the SNP on the PC and the effect of the PC on lung cancer risk were also significant (a = 1.672, SE = 0.714, P = 0.019, b = 0.014, SE = 0.001, P < 0.001). Bootstrapping analysis suggested that smoking behavior was a mediator (indirect effect a*b = 0.023, SE = 0.010, P = 0.017). However, the direct effect of the SNP on lung cancer risk was still significant after adjustment for smoking behavior (c′ = 0.278, SE = 0.049, P < 0.001), suggesting that smoking behavior only partially explains the effect of rs6495304 on lung cancer risk (ab/c = 7.62%).

Figure 1. Mediation analysis: graphical representation. Smoking behavior (PC: acquired from smoking status, cigarettes/day, smoking duration and pack-years) mediation model of the relationship between rs6495304 and lung cancer, adjusted by age and sex. In the model, the percentage of the effect mediated by smoking behavior is ~7.62% of the total effect of rs6495304 on lung cancer risk. *P < 0.05; **P < 0.001.

Variance explained by the independent variants in 15q25 region

On the basis of the two SNPs we identified and those reported in previous papers, we estimated the proportion of phenotypic variance explained using a liability threshold model with the population prevalence of 0.06% for lung cancer (1). We performed a heritability analysis using NJMU GWAS data and found that rs6495304 and rs74733525 could explain 2.88‰, whereas the 10 SNPs reported previously in this region could only explain 1.29‰ (Supplementary Table 7, available at Carcinogenesis Online). In total, these twelve SNPs combined could explain 4.43‰ of the phenotypic variance. The two identified variants alone explain approximately 65.01% of the phenotypic variance attributable to genetic variation.

Discussion

Chromosome 15q25 is a bona fide locus associated with lung cancer risk and nicotine addiction that harbors the α5α3β4 family of nicotinic receptor genes. In this study, we provided a more comprehensive and accurate genetic landscape at 15q25 in Chinese individuals by using targeted resequencing technology. We identified two novel genetic variants (rs6495304 and rs74733525) that were associated with lung cancer risk and subsequently validated in large-scale GWASs within both Asian and European populations. We also presented data that the rs6495304 A allele may affect lung cancer development partially by pathways related to smoking behavior. In our analysis, rs6495304 and rs74733525 could explain approximately 65.01% of the phenotypic variance among the twelve SNPs for lung cancer in this region.

Many previous studies of the effect of the 15q25 locus on lung cancer risk underscored the differences in LD structure between populations of European, Asian and African ancestry (34,35). This phenomenon is consistent with early population migration and evolution patterns (36). Refining this region may help to identify association signals that are more likely to be population specific. The two variants identified in our study varied greatly in allelic frequencies among different ethnicities. Rs6495304 is common in Han Chinese in Beijing (CHB: MAF = 0.24) but rare in Centre d'Etude du Polymorphisme Humain Utah (CEU: MAF = 0.01) according to data from the 1000 Genomes Project. Rs74733525 is common in CHB (MAF = 0.08) and monomorphic in CEU.

In addition, studies have investigated the mediation effect of smoking behavior on the relationship between variants at the 15q25 region and lung cancer risk with little consensus as to the relative impact of the variants on the propensity for smoking or a direct carcinogenic effect. VanderWeele et al. (37) found no evidence that the CHRNA5-A3-B4 was associated with lung cancer, in interaction with smoking, whereas others found that this region indirectly acts on lung cancer risk through the mediation of smoking behavior (38). In our study, we demonstrated that smoking was a mediator of the association between rs6495304 and lung cancer risk, which accounts for only 7.62% of the total effect of rs6495304 on lung cancer risk. We further confirmed the mediating effect by examining the effect among never-smokers (P = 0.08). The mediation analyses suggested that the association of rs6495304 with lung cancer risk is only marginally explained by smoking; thus, smoking behavior is necessary, but not sufficient, for the association between the SNP and lung cancer risk. Rs6495304 also showed a statistically significant multiplicative interaction with gender (P = 0.002, data not shown), indicating that gender is also important for susceptibility to lung cancer; however, this might be confounded by the fact that 95% of patients among ever-smokers were men, suggesting that this association may be related to the sex difference in the propensity for smoking. Further, we performed stratification analyses by smoking status according to gender and only observed a stronger association in male ever-smokers (Supplementary Table 5, available at Carcinogenesis Online). Together, it was speculated that the interaction between rs6495304 and gender was due to the smoking factor.

Certain biologic hypotheses are consistent with this statistical evidence. Rs6495304, residing in the first intron of the AGPHD1 gene, demonstrates evidence of histone modifications consistent with promoter and enhancer activity in several cell lines. It has also been hypothesized that rs6495304 may alter the POU2F2 motif. POU2F2 expressed in mammalian neuronal cells (39) has been reported to be involved in important brain processes including neural and vascular development, neuronal survival, and basic cellular functions. In the expression quantitative trait loci analysis, we found that the rs6495304 A allele was significantly associated with low mRNA expression of α3 nAChR. Nicotine could act on these receptors to release corticotropin-releasing factor in axon terminals of median eminence (40,41). Ainhoa et al. (42) indicated that corticotropin-releasing factor plays a critical role in stress-induced reinstatement of nicotine-seeking behavior. Thus, corticotropin-releasing factor-dependent stress produced by the change of CHRNA3 expression in the hypothalamus might reinstate a previously extinguished nicotine-seeking behavior.

In this study, we reported two novel SNPs on chromosome 15q25 previously undetected in the Chinese population and highlighted the role of smoking behavior in mediating the link between rs6495304 and lung cancer risk. As sample size places a limitation on the ability to investigate rare variants, studies with larger sample sizes are needed. Further functional studies are also required to confirm the roles of the newly discovered variants.

Supplementary material

Supplementary data are available at Carcinogenesis online.

Funding

This work was funded by the National Natural Science Foundation of China (81230067, 81521004, 81302488 and 81422042), the Outstanding Young Fund of Jiangsu Province (grant number BK20160046), National Program for Support of Top-notch Young Professionals from the Organization Department of the CPC Central Committee, Jiangsu Specially-Appointed Professor project, the Priority Academic Program for the Development of Jiangsu Higher Education Institutions [Public Health and Preventive Medicine] and Top-notch Academic Programs Project of Jiangsu Higher Education Institutions (PPZY2015A067). The authors wish to thank all the study participants, research staff and students who participated in this work.

Acknowledgments

The authors would like to thank the patients and the supporting staff in this study.

Conflict of Interest Statement: None declared.

References

1. Ferlay, J. et al. (2015) Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int. J. Cancer, 136, E359–E386.
2. Chen, W. et al. (2016) Cancer statistics in China, 2015. CA Cancer J. Clin., 66, 115–132.
3. Hecht, S.S. (2003) Tobacco carcinogens, their biomarkers and tobacco-induced cancer. Nat. Rev. Cancer, 3, 733–744.
4. Li, Q. et al. (2011) Prevalence of smoking in China in 2010. N. Engl. J. Med., 364, 2469–2470.
5. Batra, A. et al. (2009) Treatment of tobacco dependence: a responsibility of psychiatry and addiction medicine. Nervenarzt, 80, 1022–9.
6. Bierut, L.J. et al. (1998) Familial transmission of substance dependence: alcohol, marijuana, cocaine, and habitual smoking: a report from the Collaborative Study on the Genetics of Alcoholism. Arch. Gen. Psychiatry, 55, 982–988.
7. Li, M.D. et al. (2003) A meta-analysis of estimated genetic and environmental effects on smoking behavior in male and female adult twins. Addiction, 98, 23–31.
8. Wei, C. et al. (2011) A case-control study of a sex-specific association between a 15q25 variant and lung cancer risk. Cancer Epidemiol. Biomarkers Prev., 20, 2603–2609.
9. Fowler, C.D. et al. (2011) Habenular α5 nicotinic receptor subunit signalling controls nicotine intake. Nature, 471, 597–601.
10. Amos, C.I. et al. (2008) Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat. Genet., 40, 616–622.
11. Hung, R.J. et al. (2008) A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature, 452, 633–637.
12. Thorgeirsson, T.E. et al. (2008) A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature, 452, 638–642.
13. Wu, C. et al. (2009) Genetic variants on chromosome 15q25 associated with lung cancer risk in Chinese populations. Cancer Res., 69, 5065–5072.
14. Laitinen, V.H. et al.; PRACTICAL Consortium. (2015) Fine-mapping the 2q37 and 17q11.2-q22 loci for novel genes and sequence variants associated with a genetic predisposition to prostate cancer. Int. J. Cancer, 136, 2316–2327.
15. Kachuri, L. et al. (2016) Fine mapping of chromosome 5p15.33 based on a targeted deep sequencing and high density genotyping identifies
17. Hu, Z. et al. (2011) A genome-wide association study identifies two new lung cancer susceptibility loci at 13q12.12 and 22q12.2 in Han Chinese. Nat. Genet., 43, 792–796.
18. Dong, J. et al. (2012) Association analyses identify multiple new lung cancer susceptibility loci and their interactions with smoking in the Chinese population. Nat. Genet., 44, 895–899.
19. Landi, M.T. et al. (2009) A genome-wide association study of lung cancer identifies a region of chromosome 5p15 associated with risk for adenocarcinoma. Am. J. Hum. Genet., 85, 679–691.
20. Zuzarte, P.C. et al. (2014) A two-dimensional pooling strategy for rare variant detection on next-generation sequencing platforms. PLoS One, 9, e93455.
21. Li, H. et al. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754–1760.
22. McKenna, A. et al. (2010) The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res., 20, 1297–1303.
23. Tian, S. et al. (2016) Impact of post-alignment processing in variant discovery from whole exome data. BMC Bioinformatics, 17, 403.
24. Pruim, R.J. et al. (2010) LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics, 26, 2336–2337.
25. Yang, J. et al. (2012) Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet., 44, 369–375.
26. Lakota, K. et al. (2012) International cohort study of 73 anti-Ku-positive patients: association of p70/p80 anti-Ku antibodies with joint/bone features and differentiation of disease populations by using principal-components analysis. Arthritis Res. Ther., 14, R2.
27. Baron, R.M. et al. (1986) The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J. Pers. Soc. Psychol., 51, 1173–1182.
28. VanderWeele, T.J. (2016) Mediation analysis: a practitioner's guide. Annu. Rev. Public Health, 37, 17–32.
29. Preacher, K.J. et al. (2008) Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behav. Res. Methods, 40, 879–891.
30. Huang, Y.T. (2015) Integrative modeling of multi-platform genomic data under the framework of mediation analysis. Stat. Med., 34, 162–178.
31. Lee, S.H. et al. (2012) A better coefficient of determination for genetic profile analysis. Genet. Epidemiol., 36, 214–224.
32. Hsiung, C.A. et al. (2010) The 5p15.33 locus is associated with risk of lung adenocarcinoma in never-smoking females in Asia. PLoS Genet., 6, 182–188.
33. Ward, L.D. et al. (2012) HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res., 40, D930–D934.
34. Ware, J.J. et al. (2012) From men to mice: CHRNA5/CHRNA3, smoking behavior and disease. Nicotine Tob. Res., 14, 1291–1299.
35. Improgo, M.R. et al. (2010) From smoking to lung cancer: the CHRNA5/A3/B4 connection. Oncogene, 29, 4874–4884.
36. Zietkiewicz, E. et al. (1997) Nuclear DNA diversity in worldwide distributed human populations. Gene, 205, 161–171.
37. VanderWeele, T.J. et al. (2012) Genetic variants on 15q25.1, smoking, and lung cancer: an assessment of mediation and interaction. Am. J. Epidemiol., 175, 1013–1020.
38. Wang, J. et al. (2010) Mediating effects of smoking and chronic obstructive pulmonary disease on the relation between the CHRNA5-A3 genetic locus and lung cancer risk. Cancer, 116, 3458–3462.
39. He, X. et al. (1989) Expression of a large family of POU-domain regulatory genes in mammalian brain development. Nature, 340, 35–41.
40. Okuda, H. et al. (1993) The presence of corticotropin-releasing factor-like immunoreactive synaptic vesicles in axon terminals with nicotinic acetylcholine receptor-like immunoreactivity in the median eminence of the rat. Neurosci. Lett., 161, 183–186.
41. Bugajski, J. et al. (2002) Involvement of prostaglandins in the nicotine-induced pituitary-adrenocortical response during social stress. J. Physiol.
Pharmacol., 53(4 Pt 2), 847–857. novel lung cancer susceptibility loci. Carcinogenesis, 37, 96–105. 42. Plaza-Zabala, A. et  al. (2010) Hypocretins regulate the anxiogenic16. Salomon, M.P. et al. (2016) GWASeq: targeted re-sequencing follow up like effects of nicotine and induce reinstatement of nicotine-seeking to GWAS. BMC Genomics, 17, 176. behavior. J. Neurosci., 30, 2300–2310.

Cheng, Yang, Wang, Cheng, Zhu, Meng, Dai, Juncheng, Wang, Yuzhuo, Geng, Liguo, Li, Zhihua, Zhang, Jiahui, Ma, Hongxia, Jin, Guangfu, Lin, Dongxin, Hu, Zhibin, Shen, Hongbing. Targeted sequencing of chromosome 15q25 identified novel variants associated with risk of lung cancer and smoking behavior in Chinese, Carcinogenesis, 2017, 552-558, DOI: 10.1093/carcin/bgx025