CRISPR-ERA: a comprehensive design tool for CRISPR-mediated gene editing, repression and activation
Bioinformatics, 31(22), 2015, 3676–3678
doi: 10.1093/bioinformatics/btv423
Advance Access Publication Date: 23 July 2015
Applications Note
Genome analysis
Honglei Liu1,2, Zheng Wei1, Antonia Dominguez3, Yanda Li1,
Xiaowo Wang1,* and Lei S. Qi2,3,4,*
1Bioinformatics Division, Center for Synthetic and Systems Biology, TNLIST/Department of Automation, Tsinghua
University, Beijing 100084, China, 2Stanford Chemistry, Engineering & Medicine for Human Health (ChEM-H),
3Department of Bioengineering and 4Department of Chemical and Systems Biology, Stanford University, 443 Via
Ortega, Shriram Center 376, Stanford, CA 94305-4125, USA
*To whom correspondence should be addressed.
Associate Editor: John Hancock
Received on April 13, 2015; revised on July 3, 2015; accepted on July 16, 2015
Abstract
Summary: The CRISPR/Cas9 system was recently developed as a powerful and flexible technology
for targeted genome engineering, including genome editing (altering the genetic sequence) and
gene regulation (without altering the genetic sequence). These applications require the design of
single guide RNAs (sgRNAs) that are efficient and specific. However, this remains challenging, as it
requires the consideration of many criteria. Several sgRNA design tools have been developed for
gene editing, but currently there is no tool for the design of sgRNAs for gene regulation. With accumulating experimental data on the use of CRISPR/Cas9 for gene editing and regulation, we implement a comprehensive computational tool based on a set of sgRNA design rules summarized from
these published reports. We report a genome-wide sgRNA design tool and provide an online website for predicting sgRNAs that are efficient and specific. We name the tool CRISPR-ERA, for clustered regularly interspaced short palindromic repeat-mediated editing, repression, and activation
(ERA).
Availability and implementation: http://CRISPR-ERA.stanford.edu.
Contact: or
Supplementary information: Supplementary data are available at Bioinformatics online.
1 Introduction
The bacterial adaptive immune system, CRISPR (clustered regularly
interspaced short palindromic repeats), was recently developed as a
powerful and multi-purpose technology for genome engineering,
including editing (modifying the genomic sequence) (Cong et al.,
2013; Mali et al., 2013), and regulation (repressing or activating expression of genes) (Gilbert et al., 2013, 2014; Qi et al., 2013). The
system is highly programmable, utilizing a single protein, the nuclease Cas9 for editing or the nuclease-deficient dCas9 for regulation.
A single guide RNA (sgRNA) is required for precise and programmable DNA targeting (Doudna and Charpentier, 2014). Effective
and specific genome engineering requires careful design of sgRNAs,
which remains a major challenge. Computational tools have been
used to facilitate the design of sgRNAs for CRISPR editing but not
for other applications such as transcriptional regulation. Such computational tools should enable automated sgRNA design and off-target site validation (Bae et al., 2014; Doench et al., 2014; Heigwer
et al., 2014; O’Brien and Bailey, 2014; Xiao et al., 2014). A major
goal of our designer tool is to fill this gap by designing
sgRNAs that allow efficient and highly specific repression or activation of genes, and by generating genome-wide sgRNA libraries for
genetic screening in different organisms.
© The Author 2015. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/),
which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact
Fig. 1. CRISPR-ERA workflow and example. The CRISPR-ERA algorithm takes input information, including the type of genome manipulation, the organism, and a gene
name or genome location, and then computes and evaluates sgRNAs within the targeted genome region. By default, for editing, the algorithm chooses sgRNA
sequences within the coding region; for repression, the algorithm computes sgRNA binding sites within a 3 kb region centered at the TSS (or on the sense strand of the 5′
end of the gene, for bacteria only); for activation, the algorithm computes sgRNA binding sites up to 1.5 kb upstream of the TSS. In this figure, the mouse gene Sox2 is
shown as an example. E, efficacy score; S, specificity penalty score (Supplementary Methods).
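The default search regions listed in the caption of Fig. 1 can be illustrated with a short sketch. This is not the CRISPR-ERA implementation: the function name `default_window`, its parameters, and the 1-based inclusive coordinate convention are assumptions made for illustration, and the bacterial special case (sense strand of the 5′ end of the gene) is not modeled.

```python
from typing import Optional, Tuple

HALF_WINDOW = 1500  # 1.5 kb; repression uses 1.5 kb on each side of the TSS (3 kb total)


def default_window(mode: str, tss: int, strand: str,
                   cds: Optional[Tuple[int, int]] = None) -> Tuple[int, int]:
    """Return a hypothetical (start, end) search region, 1-based and inclusive.

    mode   : "editing", "repression" or "activation"
    tss    : genomic coordinate of the transcriptional start site
    strand : "+" or "-" (strand of the target gene)
    cds    : (start, end) of the coding region, needed for editing only
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if mode == "editing":
        # Editing: search within the coding region supplied by the caller.
        if cds is None:
            raise ValueError("editing requires the coding-region coordinates")
        return cds
    if mode == "repression":
        # Repression: region of about 3 kb centered at the TSS.
        return (max(1, tss - HALF_WINDOW), tss + HALF_WINDOW)
    if mode == "activation":
        # Activation: up to 1.5 kb upstream of the TSS. "Upstream" means
        # lower coordinates on the + strand and higher ones on the - strand.
        if strand == "+":
            return (max(1, tss - HALF_WINDOW), tss)
        return (tss, tss + HALF_WINDOW)
    raise ValueError("unknown mode: " + mode)
```

For a gene on the minus strand, the activation window extends toward higher coordinates, which is why the strand is an explicit argument here.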
Here, we describe the CRISPR-ERA web server, an automated and
comprehensive sgRNA design tool for CRISPR-mediated editing,
repression, and activation (ERA) (Fig. 1). CRISPR-ERA utilizes a
fast algorithm to search for genome-wide sgRNA binding sites and
evaluates their efficiency and specificity using a set of rules summarized from published data for CRISPR editing, repression and activation (Cong et al., 2013; Doudna and Charpentier, 2014; Gilbert
et al., 2014; Qi et al., 2013; Ran et al., 2013). The design features
are annotated and the target sites can be visualized in a genome
browser. We also provide a local version for the generation of
whole-genome sgRNA libraries.
2 Methods
For each target gene or genomic site, CRISPR-ERA first searches
all targetable sites in that particular organism for the pattern
N20NGG (N = any nucleotide). Two scores are then calculated for each target sequence (Supplementary Methods): (i) an efficacy score
(E-score) based on sequence features such as GC content
(%GC) and the presence of poly-thymidine (which acts as a terminator and prevents effective transcription of sgRNAs), and on location information such as
the distance from the transcriptional start site (TSS) of the target gene; and
(ii) a specificity score (S-score) based on genome-wide off-target
binding sites. For each sgRNA design, we use Bowtie (Langmead
et al., 2009) to find all genome-wide sequences that lie adjacent to an NRG (R = A or G)
protospacer adjacent motif (PAM) site and match the sgRNA target sequence with zero, one, two, or three
mismatches; these are regarded as off-target binding sites. The
penalty score for an NAG off-target is smaller than that for an NGG off-target.
The sgRNAs are finally ranked by the sum of the E-score and S-score
(Fig. 1; Supplementary Fig. S1).
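The site search and the sequence features named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CRISPR-ERA code: all function names are hypothetical, the poly-thymidine run length of four is an assumption (the text does not give a threshold), and a naive linear scan stands in for the Bowtie alignment used by the tool. The actual E-score and S-score formulas are given in the Supplementary Methods and are not reproduced here.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, pam_regex="[ACGT]GG"):
    """Yield (start, strand, protospacer, pam) for each 20-nt site followed by a PAM.

    Both strands are scanned; `start` is the 0-based offset on the scanned
    strand (on the reverse complement for strand "-"). A lookahead is used so
    that overlapping sites are all reported.
    """
    seq = seq.upper()
    pattern = re.compile("(?=([ACGT]{20})(" + pam_regex + "))")
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for m in pattern.finditer(s):
            yield m.start(), strand, m.group(1), m.group(2)


def gc_percent(protospacer):
    """GC content of the protospacer, in percent."""
    return 100.0 * sum(base in "GC" for base in protospacer) / len(protospacer)


def has_poly_t(protospacer, run=4):
    """True if the protospacer contains `run` consecutive T (run=4 is an assumption)."""
    return "T" * run in protospacer


def mismatches(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def off_targets(protospacer, genome, max_mm=3):
    """Naive stand-in for the Bowtie search: NRG-adjacent sites with <= max_mm mismatches.

    The perfectly matching on-target site, if present in `genome`, is
    reported as well (with zero mismatches).
    """
    hits = []
    for start, strand, site, pam in find_sites(genome, pam_regex="[ACGT][AG]G"):
        mm = mismatches(protospacer, site)
        if mm <= max_mm:
            hits.append((start, strand, mm, pam))
    return hits


# Toy example: one NGG target site plus a 2-mismatch site next to an NAG PAM.
target = "GACGTTAGCCATGCAGTCAA"
toy_genome = "TT" + target + "TGG" + "AAAA" + "TACGTTAGCCATGCAGTCAC" + "CAG"
for hit in off_targets(target, toy_genome):
    print(hit)
```

A scan of this kind is only practical for short sequences; for whole genomes an indexed aligner such as Bowtie, as used in the text, is the appropriate choice.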
We implement a user-friendly web server (http://CRISPR-ERA.stanford.edu)
that hosts the web application for the sgRNA designer
tool. The web server will host a broad range of sequenced organisms. Currently, it provides an sgRNA design service for nine of the most
commonly used prokaryotic and eukaryotic organisms:
Escherichia coli, Bacillus subtilis, Saccharomyces cerevisiae,
Drosophila melanogaster, Caenorhabditis elegans, Danio rerio,
Rattus norvegicus, Mus musculus, and Homo sapiens
(Supplementary Table S1). The web application enables rapid
searching in the pre-assembled sgRNA (...truncated)