Genetic variants and risk of gastric cancer: a pathway analysis of a genome-wide association study
Lee et al. SpringerPlus
Ju-Han Lee 0
Younghye Kim 0
Jung-Woo Choi
Young-Sik Kim
0 Equal contributors Department of Pathology, Korea University Ansan Hospital, 123, Jeokgeum-Ro, Danwon-Gu, Ansan-Si, Gyeonggi-Do 425-707, Republic of Korea
This study aimed to identify candidate single nucleotide polymorphisms (SNPs) and to hypothesize the biological pathways through which they may influence gastric cancer (GC). We performed an Identify Candidate Causal SNPs and Pathways (ICSNPathway) analysis using a GC genome-wide association study (GWAS) dataset, including 472,342 SNPs in 2,240 GC cases and 3,302 controls of Asian ethnicity. By integrating linkage disequilibrium analysis, functional SNP annotation, and pathway-based analysis, seven candidate SNPs, four genes, and 12 pathways were selected. The ICSNPathway analysis produced four hypothetical mechanisms of GC: (1) rs4745 and rs12904 → EFNA1 → ephrin receptor binding; (2) rs1801019 → UMPS → drug and pyrimidine metabolism; (3) rs364897 → GBA → cyanoamino acid metabolism; and (4) rs11187870, rs2274223, and rs3765524 → PLCE1 → lipid biosynthetic process, regulation of cell growth, and cation homeostasis. This pathway analysis of a GWAS dataset suggests that these four hypothetical biological mechanisms might contribute to GC susceptibility.
Genome-wide association study; Pathway-based analysis; Gastric cancer
Introduction
Despite a decline in its incidence, gastric cancer (GC) is
still the second most common cause of cancer-related
death worldwide (Hohenberger and Gretschel 2003).
Furthermore, GC remains one of the most prevalent
high-mortality cancers in Northeast Asia (Hohenberger
and Gretschel 2003). Helicobacter pylori infection is the
strongest risk factor for GC (Polk and Peek 2010), but
only a small proportion of infected individuals develop
malignancy. Thus, genetic factors such as polymorphisms
in GC-related genes, in addition to dietary factors and
environmental factors, substantially contribute to GC
susceptibility (Milne et al. 2009).
Genome-wide association studies (GWAS) have proved
successful in identifying associations between specific
genes and complex diseases (Manolio 2010), and opened a
new phase in researching the genetic causes of disease.
Furthermore, GWAS datasets are increasingly being used
to recognize the biological pathways underlying complex
diseases (Ramanan et al. 2012), because functional
pathway analysis of genomic datasets has high
statistical power to detect the biological mechanisms of disease
causation (Ramanan et al. 2012).
Recently, Zhang et al. (2011a) developed a pathway
analysis tool called Identify Candidate Causal SNPs and
Pathways (ICSNPathway). This method identifies
candidate SNPs and their corresponding
candidate pathways from GWAS data by integrating linkage
disequilibrium (LD) analysis, functional SNP annotation,
and pathway-based analysis (PBA) (Zhang et al. 2011a),
thereby making it easier to link variants to
biological mechanisms.
We conducted ICSNPathway analysis using a GC
GWAS dataset available online to identify candidate
SNPs and promising biological mechanisms that contribute
to GC susceptibility.
Methods
GWAS dataset
The GC GWAS dataset is publicly available from the
NCBI dbGaP (http://www.ncbi.nlm.nih.gov/gap). The
dataset includes genotypes of 472,342 SNPs on the Illumina
660W Quad chip from 2,240 GC cases and 3,302
controls of Chinese ethnicity (Abnet et al. 2010; Li et al.
2013). Study participants were drawn from the Shanxi
Upper Gastrointestinal Cancer Genetics Project and the
Linxian Nutrition Intervention Trial, which included a
total of 1,625 GC cases and 2,100 controls. Six hundred
and fifteen GC cases and 1,202 controls from the Shanghai
Men's Health Study, the Shanghai Women's Health Study,
and the Singapore Chinese Health Study were also
included in the database. Controls were matched for age
(±5 years), sex, and geographical location, and they were
all cancer-free at the time of enrollment (Abnet et al.
2010; Li et al. 2013). The dataset was filtered to reduce
genotyping errors. SNPs were excluded if they
showed a call rate lower than 90% in cases or controls or
significant deviation from Hardy-Weinberg equilibrium in
the controls (P < 10⁻⁴). After filtering, 470,698 SNPs remained for
downstream pathway analysis.
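The SNP-level quality control described above can be illustrated with a minimal sketch. The study does not state which Hardy-Weinberg test or software was used, so everything here is an assumption made for illustration: the 1-df chi-square goodness-of-fit test, the genotype coding (0, 1, or 2 copies of one allele, with None for a missing call), and the hypothetical function names `call_rate`, `hwe_chi2_pvalue`, and `passes_qc`. Only the two thresholds (call rate of 90% and an HWE P-value of 1e-4) come from the text.

```python
import math

def call_rate(genotypes):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square goodness-of-fit test for HWE.

    Assumption: the paper does not name its HWE test; an exact test
    would be a common alternative, especially for rare alleles.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expected count (monomorphic SNPs give chi2 = 0).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_qc(case_genotypes, control_genotypes,
              min_call_rate=0.90, hwe_alpha=1e-4):
    """Keep a SNP only if it meets both filters reported in the text."""
    # Call rate must reach the minimum in cases AND in controls.
    if call_rate(case_genotypes) < min_call_rate:
        return False
    if call_rate(control_genotypes) < min_call_rate:
        return False
    # HWE is assessed in controls only.
    called = [g for g in control_genotypes if g is not None]
    counts = [called.count(k) for k in (0, 1, 2)]
    return hwe_chi2_pvalue(*counts) >= hwe_alpha
```

For example, a SNP whose controls show 25, 50, and 25 genotypes across the three classes matches Hardy-Weinberg proportions exactly and is retained, whereas a SNP with no heterozygous controls at an allele frequency of 0.5 is excluded.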
ICSNPathway analysis
We conducted ICSNPathway analysis using the GC
GWAS dataset in two stages (Zhang et al. 2011a). First,
candidate causal SNPs were pre-selected by LD analysis
and the most significant functional SNPs were
annotated. Next, biological mechanisms for the pre-selected
candidate causal SNPs were found using PBA. A full list
of GWAS SNP P-values was used for the ICSNPathway
analysis. The ICSNPathway analysis is based on LD analysis
and the discovery of functional SNPs using i (...truncated)