Quality control of the sheep bacterial artificial chromosome library, CHORI-243

BMC Research Notes, Dec 2010

Background: The sheep CHORI-243 bacterial artificial chromosome (BAC) library is being used in the construction of the virtual sheep genome, the sequencing and construction of the actual sheep genome assembly and as a source of DNA for regions of the genome of biological interest. The objective of our study is to assess the integrity of the clones and plates which make up the CHORI-243 library using the virtual sheep genome.

Findings: A series of analyses were undertaken based on mapping the sheep BAC-end sequences (BESs) to the virtual sheep genome. Overall, very few plate-specific biases were identified, with only three of the 528 plates in the library significantly affected. The analysis of the number of tail-to-tail (concordant) BACs on the plates identified a number of plates with lower than average numbers of such BACs. For plates 198 and 213 a partial swap of the BESs determined with one of the two primers appears to have occurred. A third plate, 341, also with a significant deficit in tail-to-tail BACs, appeared to contain a substantial number of sequences determined from contaminating eubacterial 16S rRNA DNA. Additionally, a small number of eubacterial 16S rRNA DNA sequences were present on two other plates in the library, 111 and 338.

Conclusions: The comparative genomic approach can be used to assess BAC library integrity in the absence of fingerprinting. The sequences of the sheep CHORI-243 library BACs have high integrity, especially with the corrections detailed above. The library represents a high-quality resource for use by the sheep genomics community.


Ratnakumar et al. BMC Research Notes 2010, 3:334
http://www.biomedcentral.com/1756-0500/3/334

SHORT REPORT, Open Access

Abhirami Ratnakumar1,2, Ewen F Kirkness3, Brian P Dalrymple1*
* Correspondence: 1 CSIRO Livestock Industries, 306 Carmody Road, St. Lucia, QLD 4067, Australia. Full list of author information is available at the end of the article.

© 2010 Dalrymple et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Findings

We have recently demonstrated for the bovine CHORI-240 BAC library that it is possible to identify BACs with confused identities using three independent datasets (such as genome sequences, BESs and BAC fingerprints) [1]. BACs whose identities are not consistent across the three datasets are likely to have been confused, missed, duplicated or misassigned somewhere in the generation, copying or use of the library. The sheep CHORI-243 BAC library [2] will be important for the assembly of the sheep genome, particularly for the reference genome assembly based on the animal used to construct the BAC library. Hence it is extremely important to determine the integrity of the BAC-end sequences, the clones and the plates in the library. In contrast to the bovine CHORI-240 library, three or more independent datasets are not available for the sheep BACs. Here, we have addressed the question: is it possible to identify BACs with issues in a library where only the BESs themselves have been determined? To do this we used the virtual genome sequence, which is a reordering of the bovine genome sequence into the predicted order of the sequence in sheep using the integrated mapping of sheep BACs to a number of genome sequences [2,3].

Plate biases identified by sequence alignments

The proportion of the total number of BESs from each CHORI-243 plate positioned on the virtual sheep genome assembly v2.0 [3] was calculated (Figure 1A). In general a very consistent proportion of BESs from each plate was positioned, indicating that few plate biases were identified by the alignment process.

Figure 1 Proportion of the total number of sheep BESs for each plate of the BAC library. A. positioned on the virtual sheep genome. B. positioned in tail-to-tail (TT) BACs. C. in BACs with only one end positioned on the virtual sheep genome. D. Proportion of the total number of sheep BACs with two end sequences and tail-to-tail organisation for each plate of the BAC library. (Axes: x, plate number; y, A. BESs positioned / total BESs; B. BESs in TT BACs / total BESs; C. single BESs / total BESs; D. two BESs in TT BACs / total BESs.)

A more sensitive measure of integrity is to look at the number of BACs mapped to the virtual sheep genome with their BESs in a tail-to-tail arrangement, i.e. concordant (the organisation of the original sheep sequence in the sheep genome). If the integrity of the identity of reads is maintained, the proportion of BACs with tail-to-tail paired end reads should be roughly constant across all plates. On the basis of the BES positions in the bovine genome the BACs were grouped into concordant BACs (tail-to-tail) and discordant BACs (e.g. tail-to-tail outsize, tail-to-head etc.).
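The grouping into concordant and discordant BACs, and the per-plate proportion it feeds, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Hit` record, the category names, the strand convention (left read on "+", right read on "-" counts as tail-to-tail) and the insert-size window `MIN_SPAN`/`MAX_SPAN` are all assumptions made for illustration, since the text does not state the cut-offs used.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical insert-size window in base pairs (not taken from the paper).
MIN_SPAN = 50_000
MAX_SPAN = 350_000


@dataclass(frozen=True)
class Hit:
    """Placement of one BAC-end sequence on the reference (hypothetical record)."""
    chrom: str
    pos: int      # leftmost coordinate of the alignment
    strand: str   # "+" or "-"


def classify_bac(end_a: Optional[Hit], end_b: Optional[Hit]) -> str:
    """Assign one BAC to a category from the placements of its two end reads."""
    if end_a is None and end_b is None:
        return "unpositioned"
    if end_a is None or end_b is None:
        return "single_end"
    if end_a.chrom != end_b.chrom:
        return "discordant_other"
    left, right = sorted((end_a, end_b), key=lambda h: h.pos)
    span = right.pos - left.pos
    if left.strand == "+" and right.strand == "-":
        # Reads point towards each other: concordant if the span is plausible.
        if MIN_SPAN <= span <= MAX_SPAN:
            return "tail_to_tail"
        return "tail_to_tail_outsize"
    if left.strand == right.strand:
        return "tail_to_head"
    return "discordant_other"  # reads point away from each other


def tt_fraction_by_plate(calls: Iterable[Tuple[int, str]]) -> Dict[int, float]:
    """Fraction of tail-to-tail BACs among BACs with both ends positioned, per plate."""
    tt, both = Counter(), Counter()
    for plate, category in calls:
        if category in ("unpositioned", "single_end"):
            continue
        both[plate] += 1
        if category == "tail_to_tail":
            tt[plate] += 1
    return {plate: tt[plate] / both[plate] for plate in both}
```

Under these assumptions, a plate whose reads have kept their identities should return a fraction close to the library-wide value, while a plate with mislabelled reads should fall well below it.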
A plot of the proportion of tail-to-tail BACs versus the total number of positioned BACs per plate revealed some plates (7, 198, 213, 341 and 497) likely to have experienced problems in the end sequencing (Figure 1B). Plates 213 and 198 had no tail-to-tail BACs at all. However, plate 7 also had a large number of BACs with only one end sequenced (Figure 1C). Plotting the proportion of tail-to-tail BACs among BACs with sequences from both ends (Figure 1D) identified three plates, 198, 213 and 341, as definitely problematic. Further analyses revealed that some BAC-end sequences from BACs on plate 198 appear to have been derived from BACs on plate 213 and vice versa. However, a straight swap of the SP6 or T7 BESs between plates 198 and 213 only increased the number of tail-to-tail BACs to 67 for each of the plates, which is still well below the expected value of 183 BACs based on the dataset average. Thus the problem appeared to be more complicated than a straight swap of BESs between the two plates during sequencing. Plate 341 was also investigated in more detail (see below).

BES and BAC overlaps

During the assembly of genomes the number of links between different segments of DNA in contigs and scaffolds is frequently a key factor in ordering and orientating the segments. It is important that (...truncated)
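The straight-swap test described above for plates 198 and 213 can be sketched in Python as follows. This is an illustrative sketch only, not the authors' code: the `(plate, well)` keying of reads, the `count_tt` helper and the toy concordance rule with its 350 kb ceiling are hypothetical. The idea is to recount tail-to-tail BACs on a plate after pairing its SP6 reads with the T7 reads from the same wells of the other plate.

```python
from typing import Callable, Dict, Optional, Tuple

Placement = Tuple[str, int, str]            # (chromosome, position, strand)
Reads = Dict[Tuple[int, str], Placement]    # (plate, well) -> placement of one end read


def is_tail_to_tail(a: Placement, b: Placement, max_span: int = 350_000) -> bool:
    """Toy concordance rule; the 350 kb ceiling is a hypothetical value."""
    (chrom_l, pos_l, strand_l), (chrom_r, pos_r, strand_r) = sorted(
        (a, b), key=lambda hit: hit[1]
    )
    return (
        chrom_l == chrom_r
        and strand_l == "+"
        and strand_r == "-"
        and pos_r - pos_l <= max_span
    )


def count_tt(
    sp6: Reads,
    t7: Reads,
    plate: int,
    t7_plate: Optional[int] = None,
    is_tt: Callable[[Placement, Placement], bool] = is_tail_to_tail,
) -> int:
    """Count wells on `plate` whose SP6 read pairs concordantly with the T7 read
    from the same well of `t7_plate` (defaults to `plate`, i.e. no swap)."""
    source = plate if t7_plate is None else t7_plate
    total = 0
    for (p, well), sp6_hit in sp6.items():
        if p != plate:
            continue
        t7_hit = t7.get((source, well))
        if t7_hit is not None and is_tt(sp6_hit, t7_hit):
            total += 1
    return total


# Toy data: the T7 reads of plates 198 and 213 have been exchanged.
sp6_reads = {(198, "A1"): ("1", 1_000, "+"), (198, "A2"): ("2", 5_000, "+"),
             (213, "A1"): ("3", 100, "+")}
t7_reads = {(198, "A1"): ("3", 150_100, "-"), (213, "A1"): ("1", 151_000, "-"),
            (213, "A2"): ("7", 9, "-")}

as_labelled = count_tt(sp6_reads, t7_reads, 198)
after_swap = count_tt(sp6_reads, t7_reads, 198, t7_plate=213)
```

Comparing `as_labelled` with `after_swap` shows whether the swap hypothesis recovers concordant pairs. In the study, a straight swap recovered only 67 tail-to-tail BACs per plate against an expected 183, which is why the authors concluded the problem was more complicated than a simple exchange.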


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1186%2F1756-0500-3-334.pdf
Article home page: http://link.springer.com/article/10.1186/1756-0500-3-334

Abhirami Ratnakumar, Ewen F Kirkness, Brian P Dalrymple. Quality control of the sheep bacterial artificial chromosome library, CHORI-243, BMC Research Notes, 2010, pp. 334, Volume 3, Issue 1, DOI: 10.1186/1756-0500-3-334