Plant NBS-LRR proteins: adaptable guards
Leah McHale, Xiaoping Tan, Patrice Koehl and Richard W Michelmore
Correspondence: Richard W Michelmore.
Address: The Genome Center, University of California, Davis, CA 95616, USA
The majority of disease resistance genes in plants encode nucleotide-binding site leucine-rich repeat (NBS-LRR) proteins. This large family is encoded by hundreds of diverse genes per genome and can be subdivided into the functionally distinct TIR-domain-containing (TNL) and CC-domain-containing (CNL) subfamilies. Their precise role in recognition is unknown; however, they are thought to monitor the status of plant proteins that are targeted by pathogen effectors.
Plant NBS-LRR proteins are similar in sequence to members
of the mammalian nucleotide-binding oligomerization
domain (NOD)-LRR protein family (also called CARD,
transcription enhancer, R (purine)-binding, pyrin, lots of leucine
repeats (CATERPILLER) proteins), which function in
inflammatory and immune responses [6]. But although
mammalian NOD-LRR proteins have the same tripartite
domain organization as plant NBS-LRR proteins, including a
nucleotide-binding domain and a LRR domain, the
functional similarities between NBS-LRR and mammalian NOD
proteins are probably the result of convergent evolution [7].
There are no NOD-related proteins in Caenorhabditis
elegans or Drosophila melanogaster and the downstream
partners of the two families differ [7,8]. The human NOD
protein apoptotic protease activating factor 1 (APAF-1) has
an NBS domain with greater protein-sequence similarity to
plant NBS-LRR proteins than to other mammalian NOD
proteins; however, it shares neither the amino-terminal nor
the carboxy-terminal LRR domains characteristic of plant
NBS-LRR proteins.
Evolution and genome organization
Plant NBS-LRR proteins are numerous and ancient in
origin. They are encoded by one of the largest gene families
known in plants. There are approximately 150 NBS-LRR-encoding
genes in Arabidopsis thaliana, over 400 in Oryza
sativa [3,9,10], and probably considerably more in larger
plant genomes that have yet to be fully sequenced. Many
NBS-encoding sequences have now been amplified from a
diverse array of plant species using PCR with degenerate
primers based on conserved sequences within the NBS
domain and there are currently over 1,600 NBS sequences in
public databases (Additional data file 1). They are found in
non-vascular plants and gymnosperms as well as in
angiosperms; orthologous relationships are difficult to
determine, however, owing to lineage-specific gene
duplications and losses [11,12]. In several lineages,
NBS-LRR-encoding genes have become amplified, resulting in family-specific
subfamilies (Figure 2; Additional data file 2) [13]. Of the 150
NBS-LRR sequences in Arabidopsis, 62 have NBS regions
more similar to each other than to any other non-Brassica
sequences (Figure 2; Additional data file 2). Different
subfamilies have been amplified in the legumes (which include
beans), the Solanaceae (which includes tomato and potato),
and the Asteraceae (which includes sunflower and lettuce)
[13-15]. The spectrum of NBS-LRR proteins present in one
species is therefore not representative of the diversity of
NBS-LRR proteins in other plant families.
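
The degenerate-primer strategy mentioned above relies on primers that cover every codon possibility for a conserved amino-acid motif. As a rough illustration, the following Python sketch computes the degeneracy of a primer written in standard IUPAC ambiguity codes and lists the sequences it represents. The primer shown is a hypothetical short example, not one taken from the studies cited here, and the function names are likewise illustrative.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(seq) for seq in product(*choices)]


# Hypothetical primer fragment (assumption, for illustration only).
example_primer = "GGNAAR"
print(degeneracy(example_primer))
print(expand(example_primer))
```

Degeneracy grows multiplicatively with each ambiguous position, so primers for longer motifs quickly represent large pools of distinct oligonucleotides.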
NBS-LRR-encoding genes are frequently clustered in the
genome, the result of both segmental and tandem
duplications [3,10,16,17]. There can be wide intraspecific variation
in copy number because of unequal crossing-over within
clusters [18,19]. NBS-LRR-encoding genes have high levels
of inter- and intraspecific variation but not high rates of
mutation or recombination [19]. Variation is generated by
normal genetic mechanisms, including unequal crossing-over,
sequence exchange, and gene conversion, rather than
genetic events particular to NBS-LRR-encoding genes
[3,19-21].
The rate of evolution of NBS-LRR-encoding genes can be
rapid or slow, even within an individual cluster of similar
sequences. For example, the major cluster of NBS-LRR-encoding
genes in lettuce includes genes with two patterns of
evolution [19]: type I genes evolve rapidly with frequent gene
conversions between them, whereas type II genes evolve
slowly with rare gene conversion events between clades. This
heterogeneous rate of evolution is consistent with a birth-and-death
model of R gene evolution, in which gene duplication
and unequal crossing-over can be followed by density-dependent
purifying selection acting on the haplotype,
resulting in varying numbers of semi-independently
evolving groups of R genes [19,22].
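
The birth-and-death idea described above can be made concrete with a toy simulation. This is a minimal sketch under strong assumptions, not the model used in the cited work: each gene copy duplicates with a fixed probability per generation, and the probability of losing a copy rises with the number of copies in the cluster, as a crude stand-in for density-dependent purifying selection. All parameter names and values (`birth`, `death`, `capacity`, `start`) are hypothetical.

```python
import random


def simulate_cluster(generations, start=5, birth=0.05, death=0.05,
                     capacity=20, seed=0):
    """Toy birth-and-death process for copy number in one gene cluster.

    Assumption: the per-copy loss probability scales with copies / capacity,
    so larger clusters lose copies faster than smaller ones.
    """
    rng = random.Random(seed)
    copies = start
    history = [copies]
    for _ in range(generations):
        loss_p = min(1.0, death * copies / capacity)
        births = sum(rng.random() < birth for _ in range(copies))
        losses = sum(rng.random() < loss_p for _ in range(copies))
        copies = copies + births - losses
        history.append(copies)
    return history


# Copy-number trajectory over 100 generations for one simulated cluster.
print(simulate_cluster(100))
```

Running the function with different seeds gives different trajectories, which loosely mirrors how clusters in different haplotypes can end up with different copy numbers.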
The impact of selection on the different domains of
individual NBS-LRR-encoding genes is also heterogeneous [19].
The NBS domain seems to be subject to purifying selection
but not to frequent gene-conversion events, whereas the
LRR region tends to be highly variable. Diversifying
selection, as indicated by significantly elevated ratios of
nonsynonymous to synonymous nucleotide substitutions, has
maintained variation in the solvent-exposed residues of the
β-sheets of the LRR domain (see below) [19,23]. Unequal
crossing-over and gene conversion have generated variation
in the number and position of LRRs, and in-frame insertions
and/or deletions in the regions between the β-sheets have
probably changed the orientation of individual β-sheets.
There are, on average, 14 LRRs per protein and often 5 to 10
sequence variants for each repeat; therefore, even within
Arabidopsis, there is the potential for well over 9 × 10^11
variants, which emphasizes the highly variable nature of the
putative binding surface of these proteins.
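
The text does not show how the figure for potential variants was derived. As a rough check on its order of magnitude, the sketch below assumes that repeats vary independently and that every repeat has the same number of sequence variants; both are simplifying assumptions, and the function name is illustrative. Under these assumptions, 14 repeats with 5 to 10 variants each bracket the value quoted above.

```python
def lrr_combinations(num_repeats, variants_per_repeat):
    """Number of distinct LRR domains if each repeat varies independently."""
    return variants_per_repeat ** num_repeats


low = lrr_combinations(14, 5)    # 6103515625, about 6.1 x 10^9
high = lrr_combinations(14, 10)  # 100000000000000, i.e. 10^14

print(f"{low:.2e} to {high:.2e}")
```

Because the count is exponential in the number of repeats, small changes in the assumed number of variants per repeat shift the total by orders of magnitude.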
There are two major subfamilies of plant NBS-LRR proteins,
defined by the presence of Toll/interleukin-1 receptor (TIR)
or coiled-coil (CC) motifs in the amino-terminal domain
(Figure 1). Although TIR-NBS-LRR proteins (TNLs) and
CC-NBS-LRR proteins (CNLs) are both involved in pathogen
recognition, the two subfamilies are distinct both in
sequence and in signaling pathways (see below) and cluster
separately in phylogenetic analyses using their NBS domains
(see Additional data file 2) [24,25]. TNLs are completely
absent from cereal species, which suggests that the early
angiosperm ancestors had few TNLs and that these were lost
in the cereal lineage. The presence or absence of TNLs in
basal monocots is not currently known. CNLs from monocots
and dicots cluster together, indicating that angiosperm
ancestors had multiple CNLs (Figure 2) [26].
There are also 58 proteins in Arabidopsis that are related to
the TNL or CNL subfamilies but lack the full complement of
domains [3,27]. These include (...truncated)