Whole Exome Sequencing to Identify Genetic Variants Associated with Raised Atherosclerotic Lesions in Young Persons
www.nature.com/scientificreports
Received: 25 August 2016
Accepted: 16 May 2017
Published: xx xx xxxx
James E. Hixson1, Goo Jun1, Lawrence C. Shimmin1, Yizhi Wang2, Guoqiang Yu2, Chunhong Mao3,
Andrew S. Warren3, Timothy D. Howard4, Richard S. Vander Heide5, Jennifer Van Eyk6,
Yue Wang2 & David M. Herrington7
We investigated the influence of genetic variants on atherosclerosis using whole exome sequencing in
cases and controls from the autopsy study “Pathobiological Determinants of Atherosclerosis in Youth
(PDAY)”. We identified a PDAY case group with the highest total amounts of raised lesions (n = 359) for
comparisons with a control group with no detectable raised lesions (n = 626). In addition to the standard
exome capture, we included genome-wide proximal promoter regions that contain sequences that
regulate gene expression. Our statistical analyses included single variant analysis for common variants
(MAF > 0.01) and rare variant analysis for low frequency and rare variants (MAF < 0.05). In addition,
we investigated known CAD genes previously identified by meta-analysis of GWAS. We did
not identify individual common variants that reached exome-wide significance using single variant
analysis. In analysis limited to 60 CAD genes, we detected strong associations with COL4A2/COL4A1,
which previously showed associations with myocardial infarction and arterial stiffness, as well as
coronary artery calcification. Likewise, rare variant analysis did not identify genes that reached
exome-wide significance. Among the 60 CAD genes, the strongest association was with NBEAL1, which was also
identified in gene-based analysis of whole exome sequencing for early onset myocardial infarction.
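The two frequency thresholds stated above (MAF > 0.01 for single variant analysis, MAF < 0.05 for rare variant analysis) overlap, so a variant with a frequency between them would enter both analyses. The following is a minimal sketch of that assignment, not the authors' pipeline. The function names, the example allele counts, and the assumption of biallelic variants in diploid samples are hypothetical; only the two thresholds and the group sizes (359 cases, 626 controls) come from the text.

```python
# Thresholds quoted in the abstract; everything else here is illustrative.
COMMON_MAF_MIN = 0.01  # single variant analysis uses MAF > 0.01
RARE_MAF_MAX = 0.05    # rare variant analysis uses MAF < 0.05

N_SAMPLES = 359 + 626  # cases plus controls, from the abstract


def minor_allele_frequency(alt_allele_count, n_samples):
    """Return the MAF of a biallelic variant in diploid samples (assumption)."""
    alt_freq = alt_allele_count / (2 * n_samples)
    return min(alt_freq, 1.0 - alt_freq)


def assign_analyses(maf):
    """List the analyses a variant with this MAF would enter."""
    analyses = []
    if maf > COMMON_MAF_MIN:
        analyses.append("single_variant")
    if maf < RARE_MAF_MAX:
        analyses.append("rare_variant")
    return analyses


# Hypothetical alternate-allele counts, chosen to land in each frequency band.
example_counts = {"variant_a": 10, "variant_b": 60, "variant_c": 400}

for name, count in example_counts.items():
    maf = minor_allele_frequency(count, N_SAMPLES)
    print(name, round(maf, 4), assign_analyses(maf))
```

In this sketch, a variant in the 0.01 to 0.05 band is tested once on its own and once as part of a gene-level rare variant test. A real pipeline would also need to drop monomorphic sites, which this example does not do.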
Coronary artery disease (CAD) due to atherosclerosis remains a major health burden across the globe.
Atherosclerosis is a life-long process that involves accumulation of lipids, inflammatory cells, and smooth muscle
cells in the intima of the arterial wall to form atherosclerotic lesions that can block blood circulation required
for transport of oxygen and critical nutrients to the heart. Population-based epidemiological studies identified
important risk factors for CAD such as elevated low density lipoprotein cholesterol (LDL-C) and reduced high
density lipoprotein cholesterol (HDL-C). Genetic factors also influence CAD risk, but the identity of the responsible
genes remains unclear. Attempts to identify genes that influence CAD began with association studies of
DNA variants in biological candidate genes from metabolic pathways with known involvement in atherosclerosis,
such as cholesterol transport and metabolism1. More recently, genetic studies of CAD have relied on genome-wide
association studies (GWAS) that test millions of genetic variants (single nucleotide polymorphisms) across the
genome for associations in large case-control studies2.
1 Human Genetics Center, UTHealth School of Public Health, Houston, TX, 77030, USA.
2 Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, 22203, USA.
3 Biocomplexity Institute of Virginia Tech, Virginia Tech, Blacksburg, VA, 24061, USA.
4 Center for Genomics & Personalized Medicine Research, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA.
5 Department of Pathology, Louisiana State University Health Science Center, New Orleans, LA, 70112, USA.
6 Advanced Clinical BioSystems Research Institute, Heart Institute and Department of Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, 90048, USA.
7 Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA.
Correspondence and requests for materials should be addressed to J.E.H. (email: James.E.Hixson@uth.tmc.edu)
Scientific Reports | 7: 4091 | DOI:10.1038/s41598-017-04433-x
| Characteristic | European American Control, Male | European American Control, Female | European American Case, Male | European American Case, Female | African American Control, Male | African American Control, Female | African American Case, Male | African American Case, Female |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Sample size | 207 | 87 | 150 | 42 | 272 | 60 | 137 | 30 |
| Age | 24.52 ± 5.27 | 28.01 ± 4.20 | 26.35 ± 5.14 | 29.10 ± 3.88 | 27.36 ± 4.24 | 28.27 ± 4.00 | 26.95 ± 4.55 | 28.17 ± 4.22 |
| Total Raised Lesions | 0.27 ± 0.76 | 0.40 ± 0.92 | 24.52 ± 20.62 | 33.79 ± 23.27 | 0.38 ± 0.80 | 0.21 ± 0.63 | 30.60 ± 27.10 | 36.09 ± 34.61 |
| LDL + VLDL | 140.33 ± 50.43 | 133.49 ± 50.64 | 161.98 ± 61.77 | 166.17 ± 77.81 | 122.75 ± 50.85 | 122.10 ± 37.65 | 153.80 ± 65.77 | 151.24 ± 47.66 |
| HDL | 48.97 ± 17.89 | 54.83 ± 20.61 | 52.97 ± 21.37 | 56.90 ± 21.04 | 57.31 ± 23.54 | 63.62 ± 27.85 | 55.91 ± 24.13 | 52.53 ± 19.15 |
| BMI | 24.64 ± 4.00 | 24.41 ± 4.73 | 26.42 ± 5.81 | 24.45 ± 6.42 | 24.76 ± 4.21 | 25.23 ± 6.14 | 25.78 ± 6.08 | 24.71 ± 5.65 |
| Thoracic Aorta FS | 17.10 ± 12.35 | 16.14 ± 11.17 | 22.69 ± 13.40 | 24.10 ± 14.91 | 23.22 ± 14.29 | 22.67 ± 13.52 | 29.95 ± 16.08 | 21.95 ± 12.58 |
| Thoracic Aorta RL | 0.01 ± 0.13 | 0.00 ± 0.00 | 1.08 ± 3.11 | 1.19 ± 4.37 | 0.01 ± 0.12 | 0.05 ± 0.41 | 2.01 ± 5.06 | 4.99 ± 13.34 |
| Abdominal Aorta FS | 22.67 ± 17.29 | 34.38 ± 21.62 | 29.05 ± 17.89 | 37.62 ± 16.02 | 27.42 ± 20.46 | 36.84 ± 21.35 | 34.57 ± 20.99 | 35.58 ± 18.22 |
| Abdominal Aorta RL | 0.09 ± 0.45 | 0.29 ± 0.86 | 11.81 ± 13.03 | 22.14 ± 17.05 | 0.18 ± 0.59 | 0.08 ± 0.30 | 14.98 ± 16.19 | 24.71 ± 18.27 |
| Coronary Artery FS | 2.03 ± 3.82 | 2.73 ± 6.63 | 7.02 ± 8.50 | 8.47 ± 11.42 | 5.25 ± 9.88 | 3.20 ± 5.02 | 11.99 ± 15.31 | 12.98 ± 15.38 |
| Coronary Artery RL | 0.16 ± 0.59 | 0.11 ± 0.38 | 11.64 ± 16.92 | 10.46 ± 18.45 | 0.18 ± 0.50 | 0.08 ± 0.35 | 13.61 ± 18.84 | 6.40 ± 16.49 |
Table 1. Characteristics of PDAY subjects (FS, fatty streaks; RL, raised lesions).
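As a quick consistency check, the eight subgroup sample sizes reported in Table 1 can be summed by case status and compared with the group totals given in the abstract (n = 359 cases, n = 626 controls). The sketch below is illustrative only; the dictionary layout and the helper name `total_by_status` are assumptions made for this example, while the counts are transcribed from the table.

```python
# Sample sizes transcribed from Table 1; keys are (ancestry, status, sex).
sample_sizes = {
    ("European American", "control", "male"): 207,
    ("European American", "control", "female"): 87,
    ("European American", "case", "male"): 150,
    ("European American", "case", "female"): 42,
    ("African American", "control", "male"): 272,
    ("African American", "control", "female"): 60,
    ("African American", "case", "male"): 137,
    ("African American", "case", "female"): 30,
}


def total_by_status(status):
    """Sum the subgroup sizes for one status ("case" or "control")."""
    return sum(
        n for (_ancestry, group, _sex), n in sample_sizes.items() if group == status
    )


print("controls:", total_by_status("control"))  # 626
print("cases:", total_by_status("case"))        # 359
```

Both totals agree with the abstract, which supports reading the table columns in the order control male, control female, case male, case female within each ancestry group.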
Despite these efforts, the identification of genes that influence CAD has remained elusive. A major reason
is that clinical CAD is a heterogeneous disease, resulting from many different pathophysiologic mechanisms. In
addition, important subclinical measures of CAD, such as the extent of atherosclerosis, are difficult to obtain in human
populations. To address these problems, a multicenter autopsy study called “Pathobiological Determinants of
Atherosclerosis in Youth (PDAY)” was established to provide direct measurements of atherosclerotic lesions. PDAY obtained arterial
measurements of subclinical atherosclerosis in young persons (15–34 years of age) who died of external causes
unrelated to heart disease (e.g., accidents, homicide, suicide). Results of PDAY studies directly demonstrated the
atherogenic effects of exposure to risk factors such as elevated plasma cholesterol and LDL levels, as well
as smoking and hypertension3.
In this study, we used PDAY to find genetic variants that are associated with a quantitative measure
of subclinical CAD, the involvement of arterial surfaces with complicated raised lesions. We selected a case group
with the highest amounts of arterial raised lesions for comparisons with a control group with no detectable raised
lesions. Our goal was to identify genetic variants that are e (...truncated)