Frequent loss of lineages and deficient duplications accounted for low copy number of disease resistance genes in Cucurbitaceae
BMC Genomics
Xiao Lin 0
Yu Zhang 0
Hanhui Kuang 0
Jiongjiong Chen 0
0 Key Laboratory of Horticulture Biology, Ministry of Education, and Department of Vegetable Crops, College of Horticulture and Forestry, Huazhong Agricultural University, Wuhan 430070, P.R. China
Background: The sequenced genomes of cucumber, melon and watermelon have relatively few R-genes, with only 70, 75 and 55 copies, respectively. The mechanism for the low copy number of R-genes in Cucurbitaceae genomes remains unknown.
Results: Manual annotation of R-genes in the sequenced genomes of Cucurbitaceae species showed that approximately half of them are pseudogenes. Comparative analysis of R-genes showed frequent loss of R-gene loci in different Cucurbitaceae species. Phylogenetic analysis, data mining and PCR cloning using degenerate primers indicated that Cucurbitaceae has a limited number of R-gene lineages (subfamilies). Comparison between R-genes from Cucurbitaceae and those from poplar and soybean suggested frequent loss of R-gene lineages in Cucurbitaceae. Furthermore, the average number of R-genes per lineage in Cucurbitaceae species is approximately one third of that in soybean or poplar. Therefore, both loss of lineages and deficient duplications in extant lineages account for the low copy number of R-genes in Cucurbitaceae. No extensive chimeras of R-genes were found in any of the sequenced Cucurbitaceae genomes. Nevertheless, one lineage of R-genes from Trichosanthes kirilowii, a wild Cucurbitaceae species, exhibits chimeric structures caused by gene conversions, and may contain a large number of distinct R-genes in natural populations.
Conclusions: Cucurbitaceae species have a limited number of R-gene lineages and each genome harbors relatively few R-genes. The scarcity of R-genes in Cucurbitaceae species is due to frequent loss of R-gene lineages and infrequent duplications in extant lineages. The evolutionary mechanisms underlying the large variation in R-gene copy number among plant species are discussed.
R-genes; Cucurbitaceae; Copy number; Evolution; Sequence exchange
Background
The vast majority of the cloned disease resistance genes
from plants encode nucleotide-binding site (NBS) and
leucine-rich repeat (LRR) domains. The NBS-LRR
proteins are often referred to as R proteins and their
encoding genes as R-genes. R proteins can be further divided
into two subclasses, the TIR (Toll/interleukin
receptor-like) subclass and the non-TIR subclass [1]. The TIR
subclass proteins have a TIR domain at their N-termini,
while most R proteins from the non-TIR subclass
have a coiled-coil (CC) domain instead.
The R-genes in plants belong to a large gene family,
and R-genes tend to be clustered in genomes. For
instance, approximately 66% of the 149 R-genes in
Arabidopsis thaliana (Col-0) and 76% of the 623 R-genes
in rice (Oryza sativa cultivar Nipponbare) are located in
clusters [2,3]. Many R-genes within a cluster belong to
the same subfamily and may have had frequent sequence
exchanges (either by gene conversion or recombination)
resulting in chimeric structures [4-16]. Those chimeras,
termed Type I R-genes, are highly diverse in different
genotypes of a species, and consequently, a large number
of R-genes with distinct sequences are predicted in a
population/species [12,13,17]. Those chimeras were
generated either by unequal crossovers or gene conversions.
The frequent sequence exchanges among some Type I
R-genes did not homogenize their coding sequences (i.e.
no concerted evolution), though their intron sequences
may be homogenized [12]. The lack of concerted
evolution for the coding sequences of R-genes was likely due
to diversifying selection after sequence exchanges [12].
In contrast to the extensively chimeric R-genes, other
R-genes (termed Type II) evolved independently and did
not have sequence exchanges with homologues. The
sequences of Type II R-genes, when present, are highly
conserved in different genotypes of the same or closely
related species. Surprisingly, these highly conserved
R-genes are frequently absent in some genotypes, showing
presence/absence (P/A) polymorphism [3,12,17-20]. For
example, 124 R-genes in two rice cultivars 9311 and
Nipponbare exhibit P/A polymorphism [3]. In the
absence haplotypes, the entire Type II R-gene sequence is
missing. Balancing selection may have played an
important role in maintaining such P/A polymorphism [20,21].
The mechanism for such balancing selection remains
poorly understood, but it is likely that the presence of
some R-genes carries a fitness cost, such as reduced viability
or reduced seed production [22].
The number of R-genes in different plant genomes
varies dramatically. Some genomes, such as the genomes of
apple and wheat, contain approximately 1,000 R-genes
[23,24]. In contrast, fewer than 100 R-genes are present in
each of the sequenced genomes of papaya, cucumber, watermelon
and melon [25-28]. It remains unclear why
the number of R-genes varies considerably in different
genomes while the total number of coding genes in a
genome is relatively stable. Interestingly, the number of
R-genes in a genome is significantly correlated with the
number of LRR-LRK encoding genes, which may also be
involved in disease resistance [29]. The identification and
annotation of R-genes in a genome are challenging, simply
because they are highly diverse and a considerable
proportion of them are pseudogenes [2,30,31]. Large deletions
(i.e. partial genes), frameshift indels or nonsense point
mutations of R-genes make annotations using computer
programs problematic. Consequently, many (the vast
majority, in some cases) R-genes may be mis-annotated by
gene prediction programs, and manual annotation is
recommended to correct the errors [2].
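The three disruption types named above (partial genes, frameshift indels and nonsense mutations) can be illustrated with a minimal screening sketch. This is not the annotation procedure used in this study; the function name `pseudogene_flags` and its flag labels are hypothetical, and the input is assumed to be a predicted coding sequence given as a plain DNA string. Such a screen can only nominate candidates for manual inspection.

```python
# Illustrative sketch only: flag coding sequences that look disrupted.
# The function name and flag labels are hypothetical, not from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def pseudogene_flags(cds):
    """Return a list of reasons why a coding DNA sequence looks disrupted."""
    seq = cds.upper()
    flags = []

    # A length that is not a multiple of three is consistent with a
    # frameshift indel or a truncated (partial) gene.
    in_frame = len(seq) % 3 == 0
    if not in_frame:
        flags.append("length_not_multiple_of_3")

    # A missing start codon is consistent with a 5' deletion.
    if not seq.startswith("ATG"):
        flags.append("missing_start_codon")

    # Split into complete codons, reading from the first base.
    codons = [seq[3 * i:3 * i + 3] for i in range(len(seq) // 3)]

    # The last codon may be a legitimate stop only when the sequence is in frame.
    internal = codons[:-1] if in_frame else codons
    for index, codon in enumerate(internal, start=1):
        if codon in STOP_CODONS:
            flags.append("premature_stop_at_codon_%d" % index)
            break

    return flags


if __name__ == "__main__":
    print(pseudogene_flags("ATGGCTTGA"))     # intact toy ORF
    print(pseudogene_flags("ATGTAAGCTTGA"))  # nonsense mutation
    print(pseudogene_flags("ATGGCTGTGA"))    # one-base insertion
```

An empty list means only that none of these simple checks fired; highly diverged or partial R-gene copies would still require the manual curation recommended above.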
The Cucurbitaceae family includes several agriculturally
important crops such as melon (Cucumis melo), cucumber
(Cucumis sativus), pumpkin (Cucurbita moschata) and
watermelon (Citrullus lanatus). Disease is one of the main
factors affecting their yields and forcing massive use of
chemical sprays. Only one R-gene, Fom-2 in melon, has
been cloned from Cucurbitaceae species, while a
candidate gene, Ccu, conferring resistance against cucumber scab
has been identified [32,33]. Recently, genomes of cucumber,
melon and watermelon have been sequenced [26-28].
Only 61, 81 (R-genes plus genes encoding TIR only) and
44 R-genes were reported in the genomes of cucumber
(9930), melon and watermelon, respectively. Low copy
number of R-genes was also found in cucumber cultivar
Gy14 [34]. The genetic mechanisms for such low copy
number of R-genes in Cucurbitaceae species remain
unclear. The R-genes from Cucurbitaceae genomes (except
watermelon) were annotated using computer programs
and were not verified manually. Though the distribution
of R-genes on cucumber chromosomes, R-gene sequences
from other Cucurb (...truncated)