Characterizing the genetic basis of methylome diversity in histologically normal human lung tissue
ARTICLE
Received 4 Oct 2013 | Accepted 31 Jan 2014 | Published 27 Feb 2014
DOI: 10.1038/ncomms4365
Jianxin Shi1, Crystal N. Marconett2,3, Jubao Duan4, Paula L. Hyland1, Peng Li1, Zhaoming Wang1,
William Wheeler5, Beiyun Zhou6, Mihaela Campan2,3, Diane S. Lee2,3, Jing Huang7, Weiyin Zhou1, Tim Triche8,
Laufey Amundadottir1, Andrew Warner9, Amy Hutchinson1, Po-Han Chen2,3, Brian S.I. Chung2,3,
Angela C. Pesatori10, Dario Consonni10, Pier Alberto Bertazzi10, Andrew W. Bergen11, Mathew Freedman12,13,
Kimberly D. Siegmund8, Benjamin P. Berman8,14, Zea Borok3,6, Nilanjan Chatterjee1, Margaret A. Tucker1,
Neil E. Caporaso1, Stephen J. Chanock1, Ite A. Laird-Offringa2,3 & Maria Teresa Landi1
The genetic regulation of the human epigenome is not fully appreciated. Here we describe the
effects of genetic variants on the DNA methylome in human lung based on methylation-quantitative trait loci (meQTL) analyses. We report 34,304 cis- and 585 trans-meQTLs, a
genetic–epigenetic interaction of surprising magnitude, including a regulatory hotspot. These
findings are replicated in both breast and kidney tissues and show distinct patterns:
cis-meQTLs mostly localize to CpG sites outside of genes, promoters and CpG islands (CGIs),
while trans-meQTLs are over-represented in promoter CGIs. meQTL SNPs are enriched in
CTCF-binding sites, DNaseI hypersensitivity regions and histone marks. Importantly, four of
the five established lung cancer risk loci in European ancestry are cis-meQTLs and, in
aggregate, cis-meQTLs are enriched for lung cancer risk in a genome-wide analysis of 11,587
subjects. Thus, inherited genetic variation may affect lung carcinogenesis by regulating the
human methylome.
1 Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, DHHS, Bethesda, Maryland 20892, USA. 2 Department of Surgery,
USC/Norris Comprehensive Cancer Center, Keck School of Medicine, Los Angeles, California 90089, USA. 3 Department of Biochemistry and Molecular
Biology, USC/Norris Comprehensive Cancer Center, Keck School of Medicine, Los Angeles, California 90089, USA. 4 Center for Psychiatric Genetics,
Department of Psychiatry and Behavioral Sciences, North Shore University Health System Research Institute, University of Chicago Pritzker School of
Medicine, Evanston, Illinois 60201, USA. 5 Information Management Services Inc., Rockville, Maryland 20852, USA. 6 Will Rogers Institute Pulmonary
Research Center, Division of Pulmonary, Critical Care and Sleep Medicine, USC Keck School of Medicine, Los Angeles, California 90089, USA. 7 Laboratory of
Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, NIH, DHHS, Bethesda, Maryland 20892, USA. 8 Bioinformatics Division,
Department of Preventive Medicine, University of Southern California, Los Angeles, California 90089, USA. 9 Pathology/Histotechnology Laboratory,
Laboratory Animal Sciences Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, USA. 10 Unit of Epidemiology, IRCCS
Fondazione Ca’ Granda Ospedale Maggiore Policlinico, Department of Clinical Sciences and Community Health, University of Milan, Milan 20122, Italy.
11 Molecular Genetics Program, Center for Health Sciences, SRI, Menlo Park, California 94025, USA. 12 Program in Medical and Population Genetics, The
Broad Institute, Cambridge, Massachusetts 02142, USA. 13 Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana-Farber
Cancer Institute, Boston, Massachusetts 02215, USA. 14 USC Epigenome Center and USC/Norris Comprehensive Cancer Center, Los Angeles, California
90089, USA. Correspondence and requests for materials should be addressed to M.T.L. (email: ).
NATURE COMMUNICATIONS | 5:3365 | DOI: 10.1038/ncomms4365 | www.nature.com/naturecommunications
© 2014 Macmillan Publishers Limited. All rights reserved.
DNA methylation plays a central role in epigenetic
regulation. Twin studies have suggested that DNA
methylation at specific CpG sites can be heritable1,2;
however, the genetic effects on DNA methylation have been
investigated only in brain tissues3,4, adipose tissues5,6 and
lymphoblastoid cell lines7. Most studies were based on the
Illumina HumanMethylation27 array, which has a low density
and mainly focuses on CpG sites mapping to gene promoter
regions. While the functional role of DNA methylation in non-promoter or non-CpG Island (CGI) regions remains largely
unknown, evidence shows roles in regulating gene splicing8 and
alternative promoters9, silencing of intragenic repetitive DNA
sequences10, and predisposing to germline and somatic mutations
that could contribute to cancer development11,12. Notably, a
recent study13 suggests that most DNA methylation alterations in
colon cancer occur outside of promoters or CGIs, in so-called
CpG island shores and shelves, and the Cancer Genome Project
has reported high mutation rates in CpG regions outside CGI in
multiple cancers14. Although expression QTLs (eQTLs) have
been extensively studied in different cell lines and tissues15, the
minimal overlap observed between cis-acting meQTLs and eQTLs
(~5–10%)3,4,7 emphasizes the necessity of mapping meQTLs
that may function independently of nearby gene expression. This
might reveal novel mechanisms for genetic effects on cancer risk,
particularly since many of the established cancer susceptibility
SNPs map to non-genic regions.
Lung diseases constitute a significant public health burden.
About 10 million Americans had chronic obstructive pulmonary
disease in 2012 (ref. 16) and lung cancer continues to be the
leading cancer-related cause of mortality worldwide17. To provide
functional annotation of SNPs, particularly those relevant
to lung diseases and traits, we systematically mapped
meQTLs in 210 histologically normal human lung tissues using
Illumina Infinium HumanMethylation450 BeadChip arrays,
which provide a comprehensive platform to interrogate the
DNA methylation status of 485,512 cytosine targets with excellent
coverage in both promoter and non-promoter regions (Fig. 1a),
CGI and non-CGI regions (Fig. 1b) and gene and non-gene
regions. Thus, our study enables the characterization of genetic
effects across the methylome in unprecedented detail. Moreover,
since DNA methylation exhibits tissue-specific features18, we
investigated whether similar meQTLs could be identified in other
tissues.
Results
Identification of cis-acting meQTLs. We profiled DNA methylation for 244 fresh-frozen histologically normal lung samples
from non-small cell lung cancer (NSCLC) patients from the
Environment and Genetics in Lung cancer Etiology (EAGLE)
study19. A subset of 210 tissue samples that passed quality control
and had germline genotype data from blood samples20 was used
for meQTL analysis. The analysis was restricted to 338,456
autosomal CpG probe (...truncated)