Identifying new sex-linked genes through BAC sequencing in the dioecious plant Silene latifolia
Blavet et al. BMC Genomics (2015) 16:546
DOI 10.1186/s12864-015-1698-7
RESEARCH ARTICLE
Open Access
N Blavet1,2, H Blavet2,3, A Muyle4, J Käfer4, R Cegan3, C Deschamps5, N Zemp1, S Mousset4, S Aubourg6, R Bergero7,
D Charlesworth7, R Hobza2,3, A Widmer1 and GAB Marais4*
Abstract
Background: Silene latifolia represents one of the best-studied plant sex chromosome systems. A new approach using
RNA-seq data has recently identified hundreds of new sex-linked genes in this species. However, this approach is
expected to miss genes that are either not expressed or are expressed at low levels in the tissue(s) used for RNA-seq.
Therefore other independent approaches are needed to discover such sex-linked genes.
Results: Here we used 10 well-characterized S. latifolia sex-linked genes and their homologs in Silene vulgaris, a species
without sex chromosomes, to screen BAC libraries of both species. We isolated and sequenced 4 Mb of BAC clones of S.
latifolia X and Y and S. vulgaris genomic regions, which yielded 59 new sex-linked genes (with S. vulgaris homologs for
some of them). We assembled sequences that we believe represent the tip of the Xq arm. These sequences are clearly
not pseudoautosomal, so we infer that the S. latifolia X has a single pseudoautosomal region (PAR) on the Xp arm. The
estimated mean gene density in X BACs is 2.2 times lower than that in S. vulgaris BACs, agreeing with the genome size
difference between these species. Gene density was estimated to be extremely low in the Y BAC clones. We compared
our BAC-located genes with the sex-linked genes identified in previous RNA-seq studies, and found that about half of
them (those with low expression in flower buds) were not identified as sex-linked in previous RNA-seq studies. We
compiled a set of ~70 validated X/Y genes and X-hemizygous genes (without Y copies) from the literature, and used
these genes to show that X-hemizygous genes have a higher probability of being undetected by the RNA-seq approach,
compared with X/Y genes; we used this to estimate that about 30 % of our BAC-located genes must be X-hemizygous.
The estimate is similar when we use BAC-located genes that have S. vulgaris homologs, which excludes genes that were
gained by the X chromosome.
Conclusions: Our BAC sequencing identified 59 new sex-linked genes, and our analysis of these BAC-located genes, in
combination with RNA-seq data, suggests that gene losses from the S. latifolia Y chromosome could be as high as 30 %,
higher than previous estimates of 10–20 %.
Keywords: Sex chromosomes, Sex-linked genes, Plant, BAC, RNA-seq, Gene loss, Y degeneration, Silene latifolia, Silene
vulgaris
* Correspondence:
4 Laboratoire de Biométrie et Biologie Evolutive (UMR 5558), CNRS/Université Lyon 1, Villeurbanne, France
Full list of author information is available at the end of the article
Background
Of only a handful of plant sex chromosome systems that
have been investigated at the molecular level, the XY
chromosome system of Silene latifolia is one of the best-studied [1, 2]. However, finding sex-linked genes in this
species has been a slow process and is still ongoing. Approaches such as screening cDNA libraries with probes
from microdissected S. latifolia Y chromosomes identified
only a few sex-linked genes (reviewed in [3]). Segregation
analysis of intron variants and SNPs within plant families
revealed more sex-linked genes (e.g. [4, 5]). Altogether,
these approaches yielded about 30 validated S. latifolia
sex-linked genes.
© 2015 Blavet et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium,
provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Recently, however, three studies used RNA-seq to
identify hundreds of S. latifolia sex-linked genes, either
using segregation patterns within families [6, 7] or male
and female full siblings from an inbred population [8].
Sex-linked genes were identified either by following allele transmission from parents to their progeny (in the
two studies using families, [6, 7]), or by searching for
SNPs homozygous in females and heterozygous in males,
indicating Y-linkage [8]. As no S. latifolia reference genome is available, these searches started with either a de
novo assembled reference transcriptome using the S. latifolia RNA-seq data [7, 8] or using 454 EST data from S.
vulgaris, a close relative without sex chromosomes [6, 9],
to map the S. latifolia reads and perform SNP-calling.
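The SNP criterion described above for the inbred-population study (homozygous in females, heterozygous in males) can be illustrated with a minimal sketch. This is not the pipeline used in [8]: the function name `is_candidate_y_linked`, the tuple-based genotype representation, and the requirement that all females share the same allele are assumptions made purely for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical representation: the two alleles called at one SNP
# for one diploid individual, e.g. ("A", "G").
Genotype = Tuple[str, str]


def is_candidate_y_linked(
    females: Sequence[Genotype], males: Sequence[Genotype]
) -> bool:
    """Flag a SNP whose genotype pattern is consistent with Y-linkage.

    Assumed rule (illustrative only): every female is homozygous for
    one shared allele, and every male is heterozygous and carries that
    allele together with a different one.
    """
    if not females or not males:
        return False  # no inference without both sexes

    # All alleles seen in females must collapse to a single allele.
    female_alleles = {allele for genotype in females for allele in genotype}
    if len(female_alleles) != 1:
        return False
    (x_allele,) = female_alleles

    # Every male must be heterozygous and carry the female allele.
    return all(
        len(set(genotype)) == 2 and x_allele in genotype for genotype in males
    )
```

For example, `is_candidate_y_linked([("A", "A"), ("A", "A")], [("A", "G"), ("G", "A")])` flags the SNP, whereas a single heterozygous female or homozygous male is enough to reject it. An inference resting on one such SNP is fragile, because a single genotyping error changes the outcome.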
Both approaches are subject to errors, especially when
sex-linkage of a contig is inferred from the segregation
pattern of only a single SNP, so the inferences were
assessed by checking for complete sex-linkage of some of
the inferred sex-linked genes, using PCR on sets of unrelated males and females [6, 7]. Further tests were done to
check whether “tester sets” of well-validated sex-linked
and autosomal genes (see above) were correctly assigned
[6–8]. The results were encouraging, with most genes
tested being correctly assigned. However, only a few newly
inferred genes (~10 in each study) were checked experimentally, and the tester sets included only 10–20 sex-linked and 0–10 autosomal genes. Moreover, the RNA-seq
studies focused on RNA from only one tissue (flower
buds) and any sex-linked genes not expressed in flower
buds, or expressed at low levels, will have been missed [6–8].
The number of sex-linked genes in S. latifolia is therefore
not yet accurately known. An alternative approach to discovering new sex-linked genes is to sequence BAC clones
from the sex chromosomes. A handful of BACs from the
S. latifolia X and Y chromosomes have already been
sequenced (e.g. [10, 11]), and they yielded few new sex-linked genes. To improve the yield, we screened a BAC
library with probes from validated X-linked or Y-linked
genes of S. latifolia, which establishes sex-linkage of all
genes found in the BAC sequences. Identifying both X-linked and Y-linked genes is important for estimating the
proportion of X-linked genes that have lost their Y counterparts, indicating Y genetic degeneration of this plant sex
chromosome system. Sequencing BACs should help identify
genes with low expression levels, some of which were probably missed by previous studies, because most sex-linked
genes identified so far in S. latifolia come from cDNA, ESTs
or RNA-seq data, which will be enriched for highly
expressed genes. Sequencing the complete S. latifolia sex
chromosomes using BACs would be extremely costly as the
X is 400 Mb, and the Y 550 Mb. However, BAC sequencing
to obtain seq (...truncated)