Genetic analysis of DNA methylation and gene expression levels in whole blood of healthy human subjects
BMC Genomics
Kristel R van Eijk 1
Simone de Jong 2
Marco PM Boks 1
Terry Langeveld
Fabrice Colas
Jan H Veldink
Carolien GF de Kovel
Esther Janson
Eric Strengman 2
Peter Langfelder
René S Kahn 1
Leonard H van den Berg
Steve Horvath
Roel A Ophoff 1 2
1 Department of Psychiatry, Rudolf Magnus Institute of Neuroscience, University Medical Center Utrecht, 3508 GA Utrecht, The Netherlands
2 Center for Neurobehavioral Genetics, University of California Los Angeles, Box 951761, Gonda 4357C, 695 Charles E. Young Drive South, Los Angeles, CA 90095-1761, USA
Background: The predominant model for regulation of gene expression through DNA methylation is an inverse association in which increased methylation results in decreased gene expression levels. However, recent studies suggest that the relationship between genetic variation, DNA methylation and expression is more complex.
Results: Systems genetic approaches for examining relationships between gene expression and methylation array data were used to find both negative and positive associations between these levels. A weighted correlation network analysis revealed that i) both transcriptome and methylome are organized in modules, ii) co-expression modules are generally not preserved in the methylation data and vice-versa, and iii) highly significant correlations exist between co-expression and co-methylation modules, suggesting the existence of factors that affect expression and methylation of different modules (i.e., trans effects at the level of modules). We observed that methylation probes associated with expression in cis were more likely to be located outside CpG islands, whereas specificity for CpG island shores was present when methylation, associated with expression, was under local genetic control. A structural equation model based analysis found strong support in particular for a traditional causal model in which gene expression is regulated by genetic variation via DNA methylation instead of gene expression affecting DNA methylation levels.
Conclusions: Our results provide new insights into the complex mechanisms between genetic markers, epigenetic mechanisms and gene expression. We find strong support for the classical model of genetic variants regulating methylation, which in turn regulates gene expression. Moreover we show that, although the methylation and expression modules differ, they are highly correlated.
DNA methylation; Gene expression; Association; Epigenetics; WGCNA
Background
Epigenetics has been described as the structural
adaptation of chromosomal regions so as to register, signal or
perpetuate altered activity states [1]. DNA methylation is
one of several forms of epigenetic modifications and
involves the covalent attachment of a methyl group to
the 5-carbon of a cytosine at a C-phosphate-G (CpG) site. These sites
are relatively rare in the genome but are enriched in
CpG islands (CGIs), which often coincide with gene promoter
regions. CpGs in these islands are less likely to be
methylated than CpGs outside these islands. Recent studies
have shown that CpGs in the shores of
CGIs are most frequently involved in differential
methylation between tissues or experimental groups [2,3].
Increased methylation of CpG islands at the 5′ end of a gene
is associated with gene repression. Possible mechanisms
for repression include interference with transcription
factor binding or through the recruitment of repressors
such as histone deacetylases [4].
Although one would expect DNA methylation at CGIs
and expression of the nearby gene to be inversely
correlated, this is not necessarily the case. Recent reports
also identified positive associations between expression
and methylation levels [5-7]. However, negative
associations between methylation and expression were found to
be enriched particularly in CGIs [6] and promoter
regions [5].
Around 30% of gene expression levels in cell lines [8]
and 23% of DNA methylation levels in blood are
heritable [9], and genetic variation associated with expression
and methylation levels has been identified in several
organisms [6,10-12], tissues [13] and populations [14].
Local (cis) and distal (trans) associations of genetic
variation with gene expression levels have been observed.
With the arrival of high-throughput DNA methylation
assays, methylation quantitative trait loci (mQTLs) can
now be studied genome-wide in any tissue or cell type of
interest. As with expression quantitative trait loci (eQTLs), more cis than
trans regulation has been identified [5-7], but mQTL enrichment
peaks much closer to transcription start sites than eQTL
enrichment does [6].
Attempts to identify three-way associations between
genetic variants, expression and methylation on a
genome-wide scale in four different brain regions did
not identify co-regulation of methylation and expression
by the same genetic variants [6], while a study of
cerebellar samples did identify three-way associations for a
number of genes [7]. In lymphoblastoid cell lines of 77
individuals of the Yoruba HapMap population,
co-regulation of expression and methylation levels by the
same genetic variants was also found, suggesting a
shared mechanism, whereby a genetic variant influences
methylation, which in turn influences expression levels
[5]. Strong evidence exists that both patterns of CpG
methylation [15,16] and gene expression [13,17,18] differ
between tissues.
The aims of the current study are i) to relate
expression levels to methylation levels, ii) to relate
co-expression modules (clusters of expression probes) to
co-methylation modules, and iii) to study the
relationship between genetic markers, methylation and
expression in whole blood of a relatively large (n=148) set of
healthy human subjects. For the genetic analysis, we
examined the associations of methylation and expression
levels and identified genetic markers associated with
these levels. To infer directionality in the relationships
between genetic variants, methylation and expression,
we calculated local edge orienting (LEO) scores based
on structural equation models [19]. This method has
been applied successfully before and will aid in
elucidating the nature of the relationship between genetic variation,
methylation and expression [20-23].
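The idea of inferring directionality from a genetic anchor can be illustrated with a minimal sketch. It is not the LEO score of [19], which is based on structural equation model fit statistics; it is a simplified stand-in that compares the Gaussian log-likelihoods of two linear chains, SNP -> methylation -> expression and SNP -> expression -> methylation. The function names, the simulated data and the effect sizes (0.8 and -0.7) are assumptions made purely for illustration; only the sample size of 148 comes from the text.

```python
import numpy as np


def gaussian_loglik(y, x):
    """Maximised Gaussian log-likelihood of the regression y ~ 1 + x."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n = y.size
    sigma2 = resid @ resid / n  # maximum-likelihood residual variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)


def orientation_score(snp, meth, expr):
    """log10 likelihood ratio of SNP->meth->expr versus SNP->expr->meth.

    Both chains have the same number of parameters, so the
    log-likelihoods are compared directly. Positive values favour
    the methylation-mediated chain.
    """
    ll_meth_first = gaussian_loglik(meth, snp) + gaussian_loglik(expr, meth)
    ll_expr_first = gaussian_loglik(expr, snp) + gaussian_loglik(meth, expr)
    return (ll_meth_first - ll_expr_first) / np.log(10)


# Hypothetical data simulated under SNP -> methylation -> expression.
rng = np.random.default_rng(0)
n = 148
snp = rng.integers(0, 3, size=n).astype(float)   # genotype coded 0/1/2
meth = 0.8 * snp + rng.normal(size=n)            # assumed effect size
expr = -0.7 * meth + rng.normal(size=n)          # assumed effect size

print(orientation_score(snp, meth, expr))
```

In this reduced setting the score depends only on whether the SNP correlates more strongly with methylation or with expression, which is why a full structural equation model comparison against several alternative models, as in the LEO approach, is preferable in practice.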
Results
Associations between methylation and expression levels
A multivariate linear model that regressed each
gene expression level on a methylation level, with age and
gender as covariates, identified 522 negative
and 276 positive cis associations between methylation
and expression levels at a False Discovery Rate (FDR) of 5%.
A negative association between methylation
and transcript level means that increased methylation
levels correlate with decreased expression levels, whereas
a positive association means that both levels increase
or decrease together. These associations involved 517 different
cis-acting CpG loci (...truncated)