CRISPR–Cas9-mediated genome editing and guide RNA design
Michael V. Wiles 0 1
Wenning Qin 0 1
Albert W. Cheng 0 1
Haoyi Wang 0 1
0 State Key Laboratory of Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences , Beijing , People's Republic of China
1 The Jackson Laboratory , 600 Main Street, Bar Harbor, ME 04609-1500 , USA
CRISPR and CRISPR-associated (Cas) proteins, which in nature comprise the RNA-based adaptive immune system of bacteria and archaea, have emerged as particularly powerful genome editing tools owing to their unrivaled ease of use and ability to modify genomes across mammalian model systems. As such, the CRISPR-Cas9 system holds promise as a "system of choice" for functional mammalian genetic studies across biological disciplines. Here we briefly review this fast-moving field and introduce the CRISPR-Cas9 system and its application to genome editing, with a focus on the basic considerations in designing the targeting guide RNA sequence.
Introduction
Site-directed DNA endonucleases are powerful tools for
genome editing. When introduced into cells, these proteins
can bind to a target DNA sequence in the genome and
create a DNA double-strand break (DSB), the repair of
which leads to varied DNA sequence modifications. The
initial efforts on developing these tools were focused on
engineering homing endonucleases (Silva et al. 2011) and
zinc finger nucleases (ZFN) (Urnov et al. 2005, 2010), and
later Transcription Activator-Like Effector Nucleases
(TALEN) (Boch et al. 2009; Moscou and Bogdanove 2009;
CRISPR-Cas9-mediated genome editing
The CRISPR-Cas system was first described in the genome
of Escherichia coli as a cluster of short palindromic repeats
separated by peculiar short spacer sequences (Ishino et al.
1987). Subsequently, it was shown that CRISPR loci are
present in the genomes of more than 40 % of bacteria and
90 % of archaea (Horvath and Barrangou 2010) and their
function is to serve as an adaptive immune defense
mechanism, protecting against phage infection by
recognizing and cleaving pathogen DNA (Horvath and
Barrangou 2010; Fineran and Charpentier 2012). By 2012,
the basic mechanism of CRISPR-Cas9 derived from
Streptococcus pyogenes was elucidated (Deltcheva et al.
2011; Jinek et al. 2012). CRISPR-Cas9 is an RNA-guided
DNA endonuclease system in which Cas9 endonuclease
forms a complex with two naturally occurring RNA
species, CRISPR RNA (crRNA) and trans-activating CRISPR
RNA (tracrRNA). This complex targets specific DNA
sequences complementary to the 20 nt (nucleotide) sequence
residing at the 5′ end of the crRNA (Jinek et al. 2012).
Conveniently, crRNA and tracrRNA can be linked by an
arbitrary stem loop sequence to generate a synthetic
single-guide RNA (sgRNA). Although it evolved naturally as a
system in bacteria, upon appropriate codon optimization of
the Cas9 coding sequence, CRISPR-Cas9 is highly active
in mammalian cells (Cho et al. 2013; Cong et al. 2013;
Jinek et al. 2013; Mali et al. 2013b).
In practice, by simply designing the 5′ 20 nt sequence of
the sgRNA to be complementary to the genomic target
sequence, the Cas9 nuclease-sgRNA complex can be
directed to a specific genomic locus, where it generates DNA DSBs.
The target defining region of the sgRNA is about 20 nt
long, with variations from 17 to 30 nt having been
successfully used (Ran et al. 2013; Fu et al. 2014). The other
key element in determining target sequence specificity is
the Protospacer Adjacent Motif (PAM) that is adjacent to
the target site at the genome locus, but is not a part of the
guide RNA sequence (see Fig. 1). For Cas9 nuclease from
S. pyogenes, the PAM sequence is NGG, while CRISPR-Cas9
systems from other species use different PAM
sequences (Cong et al. 2013; Esvelt et al. 2013; Hou et al.
2013). In bacteria, the PAM is thought to effectively
distinguish self, with the PAM not being present in the
genomic CRISPR loci, from the invading phage, whose
genome carries the PAM sequence adjacent to the target
sequence (Marraffini and Sontheimer 2010).
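The targeting rule described above (a roughly 20 nt target sequence immediately followed by an NGG PAM, for S. pyogenes Cas9) can be illustrated with a short script. The following is only a minimal sketch: the function names, the tuple layout, and the decision to report positions in strand-specific coordinates are assumptions made for illustration, and the scan does not assess off-target matches or predicted guide activity.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_spcas9_sites(seq, guide_len=20):
    """List candidate target sites followed by an NGG PAM on both strands.

    Returns tuples of (strand, start, target, pam). 'start' is the 0-based
    offset of the target within the strand being scanned, so minus-strand
    offsets refer to the reverse-complemented sequence (an illustrative
    convention, not a standard one).
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            # The PAM is recorded separately: it is required in the genome
            # but is not part of the guide RNA sequence.
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


if __name__ == "__main__":
    toy_locus = "A" * 20 + "TGG"  # hypothetical toy sequence, not a real locus
    for site in find_spcas9_sites(toy_locus):
        print(site)
```

Setting guide_len to a value between 17 and 30 mirrors the range of target-defining lengths mentioned above; the PAM pattern would have to be changed for Cas9 proteins from other species.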
CRISPR-Cas9-mediated DNA DSBs are repaired
through either the non-homologous end joining (NHEJ)
repair process or the homology-directed repair (HDR)
pathway. NHEJ repair often leads to small insertions or
deletions (indels) at the targeted site, while the HDR pathway
leads to perfect repair or precise genetic modification (see
Fig. 1) (Doudna and Charpentier 2014; Hsu et al. 2014).
Through these two DNA repair pathways, various genetic
modifications can be achieved (Fig. 1). The
NHEJ-mediated DNA repair pathway can be exploited to generate null
mutation alleles. Indel mutations generated at a target site
within an exon can lead to frameshift mutations in one or
both alleles. One major advantage of the CRISPR-Cas9
system, as compared to conventional gene targeting and
other programmable endonucleases, is the ease of
multiplexing, where multiple genes can be mutated
simultaneously simply by using multiple sgRNAs each targeting a
different gene (Wang et al. 2013a, b). In addition, when
two sgRNAs are used flanking a genomic region, the
intervening region can be deleted or inverted (Blasco et al.
2014; Canver et al. 2014; He et al. 2015). Furthermore,
chromosomal translocation can also be achieved by using
two sgRNAs targeting two genomic loci located on
different chromosomes (Choi and Meyerson 2014).
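The outcomes described above can be mimicked on a plain sequence string. The sketch below rests on simplifying assumptions: an indel is treated as frame-shifting whenever its net length is not a multiple of three, cut positions are supplied directly as 0-based string offsets rather than derived from sgRNA sequences, and all function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def causes_frameshift(indel_length):
    """True if a net insertion (+) or deletion (-) of this many bases
    within a coding exon would shift the reading frame."""
    return indel_length % 3 != 0


def delete_between_cuts(seq, cut_a, cut_b):
    """Model deletion of the region between two cut sites."""
    left, right = sorted((cut_a, cut_b))
    return seq[:left] + seq[right:]


def invert_between_cuts(seq, cut_a, cut_b):
    """Model inversion of the region between two cut sites."""
    left, right = sorted((cut_a, cut_b))
    return seq[:left] + reverse_complement(seq[left:right]) + seq[right:]


if __name__ == "__main__":
    print(causes_frameshift(-2))                       # 2 bp deletion
    print(causes_frameshift(3))                        # 3 bp insertion
    print(delete_between_cuts("AAACCCGGGTTT", 3, 9))   # toy sequence
    print(invert_between_cuts("AAACCGTTT", 3, 6))      # toy sequence
```

An in-frame indel can still disrupt protein function, so the modulo-three check is only a first-pass filter and not a prediction of a null allele.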
Fig. 1 CRISPR-Cas9-mediated genome editing. a The structure
of the Cas9-sgRNA complex binding to target DNA. Cas9 binds to
specific DNA sequences via the base-pairing of the guide
sequence on the sgRNA (pink) with the DNA target (gray). The
protospacer adjacent motif (PAM) is downstream of the target
sequence. b CRISPR-Cas9-mediated double-stranded DNA breaks
are repaired by the endogenous DNA repair machinery:
non-homologous end joining (NHEJ) or homology-directed repair
(HDR). Various genetic modifications can be generated through
these two pathways
When a DSB is generated and a donor DNA template is
provided, precise genetic modification can be introduced
through the HDR pathway (Fig. 1). For small
modifications, including incorporation of point mutations, defined
indel mutations, as well as insertion of a short sequence
such as a loxP site or an epitope tag, a single-stranded
oligodeoxynucleotide (ssODN) can be used as donor DNA.
In this design, the donor ssODN carries homologous
sequences flanking the mutation, and its total size
can be up to 200 nt. HDR efficiency does not appear to be
directly correlated with donor homology lengths (Yang
et al. 2013b), and HDR efficiency variation is likely due to
the nature of the target genomic loci, which is still poorly
understood. When DNA of larger sizes is to be introduced
into a target site, a double-stranded donor plasmid carrying
the transgene flanked by homologous arms is used (Yang
et al. 2013a).
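The ssODN layout described above (homologous sequence on either side of the intended change, with a total length of up to 200 nt) can be sketched as a small helper. Everything specific here is an assumption for illustration: the function name, the 0-based half-open coordinates, and the default arm length of 60 nt, which is an arbitrary placeholder and not a recommended value.

```python
def design_ssodn(genome, edit_start, edit_end, new_seq, arm_len=60, max_len=200):
    """Assemble a donor as left arm + new sequence + right arm.

    genome[edit_start:edit_end] is the stretch to be replaced; use
    edit_start == edit_end for a pure insertion. Coordinates are 0-based
    and half-open (an illustrative convention).
    """
    if edit_start > edit_end:
        raise ValueError("edit_start must not exceed edit_end")
    if edit_start - arm_len < 0 or edit_end + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the requested arms")

    left_arm = genome[edit_start - arm_len:edit_start]
    right_arm = genome[edit_end:edit_end + arm_len]
    donor = left_arm + new_seq + right_arm

    # The 200 nt ceiling follows the total size stated in the text.
    if len(donor) > max_len:
        raise ValueError("donor length %d exceeds %d nt" % (len(donor), max_len))
    return donor


if __name__ == "__main__":
    # Hypothetical toy locus: replace the single C with a T.
    toy_genome = "A" * 100 + "C" + "G" * 100
    print(design_ssodn(toy_genome, 100, 101, "T", arm_len=10))
```

A short insertion such as a loxP site or an epitope tag would be passed as new_seq with edit_start equal to edit_end; larger transgenes fall outside this sketch and call for a double-stranded donor plasmid, as noted above.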
Because of its ease of use, the CRISPR-Cas9 system has
swiftly become the most commonly used tool for efficient
genome editing of bacteria, plants, cell lines, primary cells,
a (...truncated)