Analysis of the ABCA4 genomic locus in Stargardt disease
Human Molecular Genetics, 2014, Vol. 23, No. 25
doi:10.1093/hmg/ddu396
Advance Access published on July 31, 2014
6797–6806
Jana Zernant1, Yajing (Angela) Xie1, Carmen Ayuso3,4, Rosa Riveiro-Alvarez3,4, Miguel-Angel
Lopez-Martinez3,4, Francesca Simonelli5, Francesco Testa5, Michael B. Gorin6,7, Samuel P.
Strom6,7,8, Mette Bertelsen9, Thomas Rosenberg9, Philip M. Boone10, Bo Yuan10, Radha Ayyagari11,
Peter L. Nagy2, Stephen H. Tsang1,2, Peter Gouras1, Frederick T. Collison12, James R. Lupski10,
Gerald A. Fishman12 and Rando Allikmets1,2,∗
1Department of Ophthalmology and 2Department of Pathology and Cell Biology, Columbia University, New York, NY, USA,
3Department of Genetics, Instituto de Investigacion Sanitaria-University Hospital Fundacion Jimenez Diaz, UAM (IIS-FJD),
Madrid, Spain, 4Centro de Investigacion Biomedica en Red (CIBER) de Enfermedades Raras, ISCIII, Madrid, Spain, 5Eye
Clinic, Multidisciplinary Department of Medical, Surgical and Dental Sciences, Second University of Naples, Naples, Italy,
6Department of Ophthalmology, 7Department of Human Genetics, Jules Stein Eye Institute and 8Department of Pathology,
David Geffen School of Medicine at UCLA, Los Angeles, CA, USA, 9Kennedy Center Eye Clinic, Glostrup Hospital, Glostrup,
Denmark, 10Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA, 11Department
of Ophthalmology, University of California San Diego, La Jolla, CA, USA and 12The Pangere Center for Hereditary Retinal
Diseases, The Chicago Lighthouse for People Who are Blind or Visually Impaired, Chicago, IL, USA
Autosomal recessive Stargardt disease (STGD1, MIM 248200) is caused by mutations in the ABCA4 gene.
Complete sequencing of ABCA4 in STGD patients identifies compound heterozygous or homozygous disease-associated alleles in 65–70% of patients and only one mutation in 15–20% of patients. This study was
designed to find the missing disease-causing ABCA4 variation by a combination of next-generation sequencing
(NGS), array-Comparative Genome Hybridization (aCGH) screening, familial segregation and in silico analyses.
The entire 140 kb ABCA4 genomic locus was sequenced in 114 STGD patients with one known ABCA4 exonic
mutation, revealing, on average, 200 intronic variants per sample. Filtering of these data resulted in 141 candidates for new mutations. Two variants were detected in four samples, two in three samples and 20 variants in
two samples; the remaining 117 new variants were detected only once. Multimodal analysis suggested 12 new
likely pathogenic intronic ABCA4 variants, some of which were specific to (isolated) ethnic groups. No copy
number variation (large deletions and insertions) was detected in any patient, suggesting that it is a very rare
event in the ABCA4 locus. Many variants were excluded since they were not conserved in non-human primates,
were frequent in African populations and, therefore, represented ancestral, and not disease-associated, variants. The sequence variability in the ABCA4 locus is extensive and the non-coding sequences do not harbor frequent mutations in STGD patients of European-American descent. Defining disease-associated alleles in the
ABCA4 locus requires exceptionally well characterized large cohorts and extensive analyses by a combination
of various approaches.
INTRODUCTION
Mutations in the ABCA4 gene are responsible for a wide variety
of retinal dystrophy phenotypes from autosomal recessive
Stargardt disease (STGD1) (1) to cone–rod dystrophy (CRD)
(2,3) and, in some advanced cases, retinitis pigmentosa (RP)
(2,4,5). While CRD and RP phenotypes are also caused by mutations in many other genes, ABCA4 is the only recognized gene
∗To whom correspondence should be addressed. Email:
© The Author 2014. Published by Oxford University Press. All rights reserved.
Received May 29, 2014; Revised May 29, 2014; Accepted July 29, 2014
RESULTS
Discovery of new disease-associated variants
by next-generation sequencing
Sequencing of the entire ABCA4 genomic locus, at an average
depth of coverage of 100×, in 130 patients with ABCA4-associated disease harboring one previously known ABCA4
disease-associated allele, and 6 patients with no known
ABCA4 mutations, resulted in detecting 1745 different variants.
Eighty-three of these were previously known disease-associated
or benign variants from coding regions and pathogenic splice site
variants. Six hundred and ninety-five (695) variants were also
detected in 1000 Genomes Project or Exome Sequencing
Project, with no statistically significant differences in allele frequencies between the general population and the patient cohort,
unless the variants were on the same allele (haplotype) with the
frequent known ABCA4 coding mutation, p.G1961E. Five
hundred and twenty-six (526) variants were incorrectly called
deletions or insertions from single nucleotide repeat areas
(homopolymers) that have proven to be difficult for the NGS approach. We also experienced a relatively high A>C/C>A/T>G/G>T
false-positive calling rate with Illumina sequencing. The number of false positives can be reduced by more stringent criteria for variant calling; however, this may also exclude
some real variants. After the filtering and verification steps, 141
new intronic ABCA4 variants remained in 114 patients. In 22
patients with one previously known ABCA4 mutation, the
second pathogenic ABCA4 allele was also found in the coding sequence or adjacent splice sites. In 6/22 cases this was due to reevaluation of several variants which had been classified as
benign, e.g. p.G991R and p.A1773V. The remaining 16 cases
represented false-negative results, probably due to technical
problems in the initial sequencing of the ABCA4 coding
regions.
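The filtering logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, its field names, the function names and the six-base run threshold are all hypothetical, and treating every variant found in a population database as excluded is a simplification of the allele-frequency comparison reported in the text.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str               # free-text label (hypothetical)
    is_indel: bool          # True for insertion/deletion calls
    context: str            # reference bases flanking the call
    known: bool             # already catalogued coding or splice-site variant
    in_population_db: bool  # present in 1000 Genomes or ESP


def has_homopolymer_run(seq: str, min_run: int = 6) -> bool:
    """Return True if seq contains min_run or more identical bases in a row."""
    if min_run < 1:
        raise ValueError("min_run must be >= 1")
    run = 0
    prev = None
    for base in seq.upper():
        run = run + 1 if base == prev else 1
        prev = base
        if run >= min_run:
            return True
    return False


def candidate_variants(variants, min_run: int = 6):
    """Keep variants that survive the three exclusion steps sketched here."""
    kept = []
    for v in variants:
        if v.known or v.in_population_db:
            continue  # previously known, or seen in the general population
        if v.is_indel and has_homopolymer_run(v.context, min_run):
            continue  # indel call inside a single-nucleotide repeat (assumed artifact)
        kept.append(v)
    return kept
```

Raising `min_run` keeps more indel calls and lowering it discards more, which mirrors the trade-off noted in the text between reducing false positives and excluding real variants.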
Of the 141 new possible candidates for disease-associated variants, two variants, c.4539+2064C>T and c.5461-1389C>A,
were detected together (on the same chromosome) as a
complex allele in four patients of Spanish or Italian descent
(Table 1). The c.5461-1389C>A variant is in an evolutionarily
less conserved area, whereas the c.4539+2064C>T variant is adjacent
to the recently reported c.4539+2028C>T and c.4539+2001G>A
variants from a conserved area (14). According to
predictive programs, none of these variants has any effect on
splicing, whether on existing cryptic splice sites or by creating
new sites. The c.4539+2064C>T and c.5461-1389C>A
haplotype segregated with the disease in all three STGD1 families from Spain (Fig. 1A–C) and was absent in 100 matched
Spanish control samples, making these variants very likely candidates for intronic ABCA4 mutations.
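The intronic variant names in this paragraph follow the HGVS coding-DNA convention: the nearest exonic nucleotide, a signed offset into the intron, and the substitution written as reference>alternate. The short Python sketch below parses such names and measures how far apart two variants lie when they are anchored to the same exon boundary. The class and function names are hypothetical, and the pattern deliberately covers only simple intronic substitutions, not the full HGVS grammar.

```python
import re
from typing import NamedTuple


class IntronicVariant(NamedTuple):
    exon_pos: int  # last (for +) or first (for -) exonic nucleotide
    offset: int    # signed distance into the intron
    ref: str       # reference base
    alt: str       # alternate base


# Matches names such as c.4539+2064C>T or c.5461-1389C>A only.
_PATTERN = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])>([ACGT])$")


def parse_intronic(hgvs: str) -> IntronicVariant:
    """Split a simple intronic substitution name into its components."""
    m = _PATTERN.match(hgvs)
    if m is None:
        raise ValueError(f"not a simple intronic substitution: {hgvs!r}")
    pos, sign, off, ref, alt = m.groups()
    offset = int(off) if sign == "+" else -int(off)
    return IntronicVariant(int(pos), offset, ref, alt)


def intronic_distance(a: str, b: str) -> int:
    """Distance in bases between two variants anchored to the same exon boundary."""
    va, vb = parse_intronic(a), parse_intronic(b)
    if va.exon_pos != vb.exon_pos or (va.offset > 0) != (vb.offset > 0):
        raise ValueError("variants are not anchored to the same exon boundary")
    return abs(va.offset - vb.offset)
```

With this sketch, `intronic_distance("c.4539+2064C>T", "c.4539+2028C>T")` gives 36 and `intronic_distance("c.4539+2064C>T", "c.4539+2001G>A")` gives 63, which puts a number on the adjacency described above.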
Two variants, c.5196+1056A>G and c.6006-609T>A,
were detected in 3/114 unrelated patients each and were absent
in 368 matched control samples. The c.5196+1056A>G
variant segregated with the disease in two families; i.e. it was
on (...truncated)