Identifying new sex-linked genes through BAC sequencing in the dioecious plant Silene latifolia

BMC Genomics, Jul 2015

Background: Silene latifolia represents one of the best-studied plant sex chromosome systems. A new approach using RNA-seq data has recently identified hundreds of new sex-linked genes in this species. However, this approach is expected to miss genes that are either not expressed or are expressed at low levels in the tissue(s) used for RNA-seq. Therefore other independent approaches are needed to discover such sex-linked genes.
Results: Here we used 10 well-characterized S. latifolia sex-linked genes and their homologs in Silene vulgaris, a species without sex chromosomes, to screen BAC libraries of both species. We isolated and sequenced 4 Mb of BAC clones of S. latifolia X and Y and S. vulgaris genomic regions, which yielded 59 new sex-linked genes (with S. vulgaris homologs for some of them). We assembled sequences that we believe represent the tip of the Xq arm. These sequences are clearly not pseudoautosomal, so we infer that the S. latifolia X has a single pseudoautosomal region (PAR) on the Xp arm. The estimated mean gene density in X BACs is 2.2 times lower than that in S. vulgaris BACs, agreeing with the genome size difference between these species. Gene density was estimated to be extremely low in the Y BAC clones. We compared our BAC-located genes with the sex-linked genes identified in previous RNA-seq studies, and found that about half of them (those with low expression in flower buds) were not identified as sex-linked in previous RNA-seq studies. We compiled a set of ~70 validated X/Y genes and X-hemizygous genes (without Y copies) from the literature, and used these genes to show that X-hemizygous genes have a higher probability of being undetected by the RNA-seq approach, compared with X/Y genes; we used this to estimate that about 30 % of our BAC-located genes must be X-hemizygous. The estimate is similar when we use BAC-located genes that have S. vulgaris homologs, which excludes genes that were gained by the X chromosome.
Conclusions: Our BAC sequencing identified 59 new sex-linked genes, and our analysis of these BAC-located genes, in combination with RNA-seq data, suggests that gene losses from the S. latifolia Y chromosome could be as high as 30 %, higher than previous estimates of 10–20 %.
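The X-hemizygous estimate rests on a mixture argument that a short calculation can illustrate. The sketch below is not the authors' procedure, which the abstract does not spell out. It assumes that the fraction of BAC-located genes missed by RNA-seq is a weighted average of the miss rates of X/Y genes and X-hemizygous genes. The function name is hypothetical, and the two miss rates passed in the example are invented for illustration and are not values from the paper; only the "about half missed" input echoes the abstract.

```python
def hemizygous_fraction(miss_observed, miss_xy, miss_hemi):
    """Solve miss_observed = h * miss_hemi + (1 - h) * miss_xy for h.

    miss_observed: fraction of BAC-located genes not detected by RNA-seq
    miss_xy:       miss rate among validated X/Y genes (assumed known)
    miss_hemi:     miss rate among validated X-hemizygous genes (assumed known)
    """
    if miss_hemi <= miss_xy:
        raise ValueError("miss_hemi must exceed miss_xy for h to be identifiable")
    h = (miss_observed - miss_xy) / (miss_hemi - miss_xy)
    # Sampling noise can push the raw estimate outside [0, 1], so clamp it.
    return min(1.0, max(0.0, h))


# Hypothetical miss rates (NOT taken from the paper), for illustration only.
estimate = hemizygous_fraction(miss_observed=0.50, miss_xy=0.35, miss_hemi=0.85)
print(f"estimated X-hemizygous fraction: {estimate:.2f}")
```

The estimate is only as good as the two reference miss rates, which here would come from the ~70 validated genes, so a small reference set implies a wide uncertainty around any such figure.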

Blavet et al. BMC Genomics (2015) 16:546
DOI 10.1186/s12864-015-1698-7

RESEARCH ARTICLE (Open Access)

N Blavet(1,2), H Blavet(2,3), A Muyle(4), J Käfer(4), R Cegan(3), C Deschamps(5), N Zemp(1), S Mousset(4), S Aubourg(6), R Bergero(7), D Charlesworth(7), R Hobza(2,3), A Widmer(1) and GAB Marais(4)*

* Correspondence: (4) Laboratoire de Biométrie et Biologie Evolutive (UMR 5558), CNRS/Université Lyon 1, Villeurbanne, France. Full list of author information is available at the end of the article.

© 2015 Blavet et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Keywords: Sex chromosomes, Sex-linked genes, Plant, BAC, RNA-seq, Gene loss, Y degeneration, Silene latifolia, Silene vulgaris

Background

Of only a handful of plant sex chromosome systems that have been investigated at the molecular level, the XY chromosome system of Silene latifolia is one of the best-studied [1, 2]. However, finding sex-linked genes in this species has been a slow process and is still ongoing. Approaches such as screening cDNA libraries with probes from microdissected S. latifolia Y chromosomes identified only a few sex-linked genes (reviewed in [3]). Segregation analysis of intron variants and SNPs within plant families revealed more sex-linked genes (e.g. [4, 5]). Altogether, these approaches yielded about 30 validated S. latifolia sex-linked genes. Recently, however, three studies used RNA-seq to identify hundreds of S. latifolia sex-linked genes, either using segregation patterns within families [6, 7] or male and female full siblings from an inbred population [8]. Sex-linked genes were identified either by following allele transmission from parents to their progeny (in the two studies using families, [6, 7]), or by searching for SNPs homozygous in females and heterozygous in males, indicating Y-linkage [8]. As no S. latifolia reference genome is available, these searches mapped the S. latifolia reads and called SNPs against either a reference transcriptome assembled de novo from the S. latifolia RNA-seq data [7, 8] or 454 EST data from S. vulgaris, a close relative without sex chromosomes [6, 9]. Both approaches are subject to errors, especially when sex-linkage of a contig is inferred from the segregation pattern of only a single SNP, so the inferences were assessed by checking for complete sex-linkage of some of the inferred sex-linked genes, using PCR on sets of unrelated males and females [6, 7]. Further tests were done to check whether "tester sets" of well-validated sex-linked and autosomal genes (see above) were correctly assigned [6–8]. The results were encouraging, with most genes tested being correctly assigned. However, only a few newly inferred genes (~10 in each study) were checked experimentally, and the tester sets included only 10–20 sex-linked and 0–10 autosomal genes. Moreover, the RNA-seq studies focused on RNA from only one tissue (flower buds), so any sex-linked genes not expressed in flower buds, or expressed at low levels, would have been missed [6–8].
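The SNP pattern searched for in [8] can be made concrete with a small sketch. The code below is an illustrative simplification and not the pipeline of any of the cited studies. It assumes that unphased diploid genotype calls are already available as allele pairs; the function name and data layout are hypothetical. A SNP is flagged as consistent with X/Y linkage when all females are homozygous for one shared allele and every male is heterozygous, carrying that allele plus the same second (putatively Y-linked) allele.

```python
from typing import Sequence, Tuple

Genotype = Tuple[str, str]  # unphased diploid call, e.g. ("A", "G")


def is_xy_consistent(females: Sequence[Genotype], males: Sequence[Genotype]) -> bool:
    """Return True if a SNP fits: females homozygous, males heterozygous."""
    if not females or not males:
        return False  # both sexes are needed to evaluate the pattern

    female_alleles = {allele for genotype in females for allele in genotype}
    if len(female_alleles) != 1:
        return False  # females do not all share one homozygous allele
    (x_allele,) = female_alleles

    y_alleles = set()
    for a, b in males:
        if a == b or x_allele not in (a, b):
            return False  # male is homozygous or lacks the female allele
        y_alleles.add(b if a == x_allele else a)

    # All males must carry the same male-specific allele.
    return len(y_alleles) == 1


# Hypothetical genotype calls for one SNP (illustration only).
females = [("A", "A"), ("A", "A"), ("A", "A")]
males = [("A", "G"), ("G", "A")]
print(is_xy_consistent(females, males))
```

Because a single SNP is weak evidence, as noted above, a contig would more plausibly be called sex-linked only when several of its SNPs show the same pattern.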
The number of sex-linked genes in S. latifolia is therefore not yet accurately known.

An alternative approach to discovering new sex-linked genes is to sequence BAC clones from the sex chromosomes. A handful of BACs from the S. latifolia X and Y chromosomes have already been sequenced (e.g. [10, 11]), and they yielded few new sex-linked genes. To improve the yield, we screened a BAC library with probes from validated X-linked or Y-linked genes of S. latifolia, which establishes sex-linkage of all genes found in the BAC sequences. Identifying both X-linked and Y-linked genes is important for estimating the proportion of X-linked genes that have lost their Y counterparts, a measure of Y genetic degeneration in this plant sex chromosome system. Sequencing BACs should help identify genes with low expression levels, some of which were probably missed by previous studies, because most sex-linked genes identified so far in S. latifolia come from cDNA, ESTs or RNA-seq data, which will be enriched for highly expressed genes. Sequencing the complete S. latifolia sex chromosomes using BACs would be extremely costly as the X is 400 Mb, and the Y 550 Mb. However, BAC sequencing to obtain seq (...truncated)


Article PDF: http://www.biomedcentral.com/content/pdf/s12864-015-1698-7.pdf
Article home page: http://www.biomedcentral.com/1471-2164/16/546

N Blavet, H Blavet, A Muyle, J Käfer, R Cegan, C Deschamps, N Zemp, S Mousset, S Aubourg, R Bergero, D Charlesworth, R Hobza, A Widmer, GAB Marais. Identifying new sex-linked genes through BAC sequencing in the dioecious plant Silene latifolia. BMC Genomics, 2015, 16:546. DOI: 10.1186/s12864-015-1698-7