CRISPRcompar: a website to compare clustered regularly interspaced short palindromic repeats

Nucleic Acids Research, Jul 2008

Clustered regularly interspaced short palindromic repeat (CRISPR) elements are a particular family of tandem repeats present in prokaryotic genomes, in almost all archaea and in about half of bacteria, and participate in a mechanism of acquired resistance against phages. They consist of a succession of direct repeats (DR) of 24–47 bp separated by similarly sized unique sequences (spacers). In the large majority of cases, the direct repeats are highly conserved, while the number and nature of the spacers are often quite diverse, even among strains of the same species. Furthermore, the acquisition of new units (DR + spacer) was shown to happen almost exclusively on one side of the locus. Therefore, the CRISPR is an interesting genetic marker for comparative and evolutionary analysis of closely related bacterial strains. CRISPRcompar is a web service created to assist biologists in the CRISPR typing process. Two tools facilitate the in silico investigation: CRISPRcomparison and CRISPRtionary. This website is freely accessible at http://crispr.u-psud.fr/CRISPRcompar/.

Published online 28 April 2008. Nucleic Acids Research, 2008, Vol. 36, Web Server issue W145–W148. doi:10.1093/nar/gkn228

Ibtissem Grissa1,*, Gilles Vergnaud1,2 and Christine Pourcel1

1 Univ. Paris-Sud 11, CNRS, UMR8621, Institut de Génétique et Microbiologie, 91405 Orsay and 2 DGA/D4S Mission pour la Recherche et l’Innovation Scientifique, 7, rue des Mathurins, 00470 Armées, France

Received January 25, 2008; Revised April 6, 2008; Accepted April 11, 2008

*To whom correspondence should be addressed. Tel: +33 1 69 15 30 01; Fax: +33 1 69 15 66 78; Email:

© 2008 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION

The clustered regularly interspaced short palindromic repeat (CRISPR)-associated system (CASS) comprises the particular repeated element CRISPR itself, the promoter for its transcription (also called the leader) and a set of cas genes responsible for its maintenance and function (1,2). It is found in most archaea and in 40% of bacteria, and is linked to a mechanism of acquired resistance against bacteriophages (3). Some genomes harbour a significant number of CRISPRs [18 in Methanocaldococcus jannaschii DSM 2661, with three different direct repeats (DRs)] (4). When different CRISPRs with the same DR are present in a genome, they have a very similar leader and generally different spacers, and only one is associated with cas genes (5). When CRISPRs from different CRISPR families exist in the same genome, one set of cas genes specific for each family is present. Finally, within a species, different strains may have different CRISPRs. The three sequenced strains of Streptococcus thermophilus illustrate this situation well: three CRISPRs were identified in this species, but only strain LMD-9 possesses all three (4).

CRISPRs evolve either by deletion or by acquisition of units (a DR and a spacer), following a mechanism first proposed by Pourcel et al. (6) and recently confirmed (7–9). In the majority of cases, new units are added at the end of the CRISPR adjacent to the leader, whereas motif deletions can occur randomly. The independent acquisition of the same spacer twice is possible, but it is infrequent and easily detected. Thus, the presence of identical spacers in the same CRISPR locus in distinct strains reflects shared ancestry. The polymorphism of CRISPRs can be used for molecular typing.
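The structure described above (copies of a direct repeat separated by spacers, with shared spacers pointing to shared ancestry) can be illustrated with a minimal Python sketch. This is not CRISPRcompar's code: the function names `split_spacers` and `shared_spacers` and the toy sequences are hypothetical, and the sketch assumes exact, error-free copies of the DR, whereas real loci often carry degenerate repeats.

```python
from typing import List


def split_spacers(locus: str, dr: str) -> List[str]:
    """Return the spacers lying between exact copies of the direct repeat."""
    parts = locus.split(dr)
    # parts[0] and parts[-1] are the flanking sequences outside the array.
    return [p for p in parts[1:-1] if p]


def shared_spacers(a: List[str], b: List[str]) -> List[str]:
    """Spacers present in both arrays, in the order they appear in `a`."""
    in_b = set(b)
    return [s for s in a if s in in_b]


# Toy data (hypothetical, not real CRISPR sequences).
DR = "GTTTTAGAGC"
strain_1 = "AAA" + DR + "ACGTACGT" + DR + "TTGACCAA" + DR + "GGCATGCA" + DR + "CCC"
strain_2 = "AAA" + DR + "CATCATCA" + DR + "TTGACCAA" + DR + "GGCATGCA" + DR + "CCC"

spacers_1 = split_spacers(strain_1, DR)
spacers_2 = split_spacers(strain_2, DR)
print(shared_spacers(spacers_1, spacers_2))  # ['TTGACCAA', 'GGCATGCA']
```

In this toy case the two strains share their last two spacers and differ only in the first unit, the pattern expected if each strain added a unit at one end of the locus after diverging (assuming, for illustration, that the leader lies on the left).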
The standard, classical technique developed for Mycobacterium tuberculosis typing (10) is spoligotyping, which consists of detecting the presence or absence of a range of spacers. This technique and other PCR-based typing methods have been applied in CRISPR genotyping to study other bacterial species (6,11–16). We recently implemented a program (CRISPRFinder) that identifies a CRISPR structure based on a thorough characterization of its components, i.e. the DR and the spacers (17). Using this program, public genome sequences are analysed and the extracted CRISPRs are stored in a database (CRISPRdb) (4). CRISPRFinder and CRISPRdb are accessible on the web together with different tools that assist in recovering spacer and DR sequences and in blasting them against GenBank. We now report on the development of a new website dedicated to the comparison of CRISPRs between strains and to the labelling of spacers when multiple alleles are analysed. CRISPRcompar is freely accessible at http://crispr.u-psud.fr/CRISPRcompar/index.php.

METHODS AND IMPLEMENTATION

Input

The CRISPRcompar program automatically retrieves from CRISPRdb all strains containing a CRISPR and offers them for comparison through an alphabetic list (alternatively, all strains from a given genus can be selected at once using the ‘strain taxonomy browser’). To compare unpublished sequences and genomes, a private database on the model of CRISPRdb (4) must first be created (http://crispr.u-psud.fr/CRISPRcompar/private/). Additional sequences from the private database can then (...truncated)

Output

For the CRISPRcomparison application, the result is shown in a table in which CRISPRs are grouped. Figure 1 shows the result of the comparison of three S. thermophilus strains. Information is given on the CRISPR position and on the number of repeats (Figure 1A). A link to the corresponding CRISPR in CRISPRdb can be activated.
When two or more alleles of a given CRISPR are found, the flanking sequences can be aligned and a link is provided to the second application, ‘CRISPRtionary’, to annotate and classify the spacers. Activating the ‘compare spacer’ button displays a table in which the CRISPR sequences are provided in FASTA format (Figure 1B). At this step, it is possible to upload an existing dictionary of spacers, to which the spacers of the new CRISPR alleles will be compared. If no pre-determined dictionary exists, one will be created in the following steps. With the FindCRISPR button, the CRISPRFinder program is used to identify DRs and spacers. Often more than one DR candidate is proposed, for two main reasons. First, several DRs may be possible, especially with short sequences (fewer than four units). Second, the CRISPR can lie in either orientation on the genome: when the submitted alleles are in different orientations, two DR sequences are proposed. The user should therefore select the appropriate consensus DR or enter a DR sequence. The ‘find spacer’ button leads to a page where spacers are labelled (Figure 1C) and different files can be recovered: (i) diffe (...truncated)
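
The dictionary step described in the Output section (compare new spacers with an existing dictionary, create entries for unseen spacers, and cope with alleles submitted in opposite orientations) can be sketched as follows. This is only an illustration under stated assumptions: the numeric labelling scheme, the function names `reverse_complement` and `label_spacers`, and the toy sequences are hypothetical, and the source does not describe how CRISPRtionary actually assigns labels.

```python
from typing import Dict, List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def label_spacers(spacers: List[str], dictionary: Dict[str, int]) -> List[int]:
    """Label each spacer, reusing dictionary entries and adding new ones.

    A spacer matching an entry in either orientation reuses that entry's
    label; otherwise it receives the next unused integer (assumed scheme).
    The dictionary is updated in place.
    """
    labels = []
    for spacer in spacers:
        if spacer in dictionary:
            key = spacer
        elif reverse_complement(spacer) in dictionary:
            key = reverse_complement(spacer)
        else:
            key = spacer
            dictionary[key] = max(dictionary.values(), default=0) + 1
        labels.append(dictionary[key])
    return labels


# Toy data (hypothetical): no prior dictionary, so one is built on the fly.
dictionary: Dict[str, int] = {}
allele_a = ["ACGTACGA", "TTGACCAA"]
# Second spacer of allele_b is the reverse complement of "TTGACCAA".
allele_b = ["CATCATCA", "TTGGTCAA"]

print(label_spacers(allele_a, dictionary))  # [1, 2]
print(label_spacers(allele_b, dictionary))  # [3, 2]
```

Passing a previously saved dictionary instead of an empty one reuses its labels, which mirrors the option of uploading an existing dictionary. Note that an allele submitted in the opposite orientation would also list its spacers in reverse order; the sketch handles only the per-spacer sequence match.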


This is a preview of a remote PDF: https://nar.oxfordjournals.org/content/36/suppl_2/W145.full.pdf
Article home page: http://nar.oxfordjournals.org/content/36/suppl_2/W145.abstract

Ibtissem Grissa, Gilles Vergnaud, Christine Pourcel. CRISPRcompar: a website to compare clustered regularly interspaced short palindromic repeats, Nucleic Acids Research, 2008, pp. W145-W148, 36/suppl 2, DOI: 10.1093/nar/gkn228