Mapping-by-sequencing accelerates forward genetics in barley
Mascher et al. Genome Biology
Martin Mascher 0
Matthias Jost 0
Joel-Elias Kuon
Axel Himmelbach
Axel Aßfalg
Sebastian Beier
Uwe Scholz
Andreas Graner
Nils Stein
0 Equal contributors. Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), OT Gatersleben, Corrensstraße 3, 06466 Stadt Seeland, Germany
Mapping-by-sequencing has emerged as a powerful technique for genetic mapping in several plant and animal species. As this resequencing-based method requires a reference genome, its application to complex plant genomes with incomplete and fragmented sequence resources remains challenging. We perform exome sequencing of phenotypic bulks of a mapping population of barley segregating for a mutant phenotype that increases the rate of leaf initiation. Read depth analysis identifies a candidate gene, which is confirmed by the analysis of independent mutant alleles. Our method illustrates how the genomic resources of barley together with exome resequencing can underpin mapping-by-sequencing.
Background
The recent profound transformation of molecular biology
by next-generation sequencing (NGS) technologies [1]
and the ready availability of reference genome sequences
[2] have enriched the plant geneticist's toolbox with what
Schneeberger and Weigel named 'fast-forward genetics'
[3]. Combining classical bulked-segregant analysis [4] with
aligning NGS read data to a reference genome has made
gene cloning essentially a single-step computational
procedure once a mapping population has been established
[5]. Within a few days' time, mapping intervals can be
delineated in silico and mined for likely candidate genes,
obviating marker saturation and physical mapping of
the target interval. Since its original implementation as
ShoreMap in an F2 population of Arabidopsis thaliana,
mapping-by-sequencing has been extended to other
population types such as isogenic backcross populations [6,7]
as well as to other plant and animal species such as rice
[8], maize [9], mouse, and zebrafish [10].
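
The core computation behind this approach can be illustrated with a minimal Python sketch. It is not the ShoreMap implementation: the names `Snp`, `window_means` and `candidate_window`, the window size and the read counts are all invented for illustration. The sketch assumes a recessive mutation, a bulk of mutant F2 plants, and SNP positions known on a single reference chromosome. Under these assumptions, the frequency of the mutant-parent allele is expected to be near 0.5 at unlinked loci and to approach 1 close to the causal gene.

```python
from collections import namedtuple

# One variant position with read counts from the mutant bulk (toy structure).
Snp = namedtuple("Snp", ["position", "ref_reads", "alt_reads"])


def allele_frequency(snp):
    """Fraction of reads in the bulk that carry the mutant-parent allele."""
    total = snp.ref_reads + snp.alt_reads
    return snp.alt_reads / total if total else 0.0


def window_means(snps, window_size):
    """Mean allele frequency per fixed-size window along one chromosome."""
    sums, counts = {}, {}
    for snp in snps:
        start = (snp.position // window_size) * window_size
        sums[start] = sums.get(start, 0.0) + allele_frequency(snp)
        counts[start] = counts.get(start, 0) + 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


def candidate_window(snps, window_size):
    """Start coordinate of the window with the highest mean frequency."""
    means = window_means(snps, window_size)
    return max(means, key=means.get)


# Invented read counts: the frequency rises towards 1 between 1 and 2 Mb.
toy_snps = [
    Snp(100_000, 12, 10),
    Snp(600_000, 9, 11),
    Snp(1_200_000, 3, 17),
    Snp(1_700_000, 0, 20),
    Snp(2_300_000, 4, 16),
    Snp(2_900_000, 10, 10),
]

if __name__ == "__main__":
    print(window_means(toy_snps, 1_000_000))
    print(candidate_window(toy_snps, 1_000_000))
```

Note that the scan relies on every variant having a known position along the chromosome, which is why the quality of the reference sequence matters.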
All successful attempts at mapping-by-sequencing in
these species could take advantage of high-quality
map-based reference sequences. A reference genome embeds
almost all genes of a species in a genomic context, a
crucial prerequisite for mapping-by-sequencing, as
sequencing of phenotypic bulks provides only allele frequencies
at variant positions, but no genotypic data that could be
used to construct a genetic map de novo to infer marker
order. How this order can be derived in the absence of a
reference genome and how rapid NGS-based gene
isolation may be implemented in species for which only draft
genome assemblies are available is not obvious. Galvão
et al. [11] have proposed the collinear gene order in
related species as a proxy for gene order in species without
a reference genome, but have also noted that this
synteny-based approach may adversely affect mapping
resolution. A novel bioinformatic procedure to find
causal mutations by whole genome sequencing without
using positional information has been applied to find
causal variants in plant species with small genomes [12].
In addition to its importance for agriculture, barley
(Hordeum vulgare L.) has been a model organism of
genetics throughout the 20th century and boasts excellent
resources for forward genetics. A large number of barley
mutants were created from the 1940s to the 1970s,
when mutation breeding programs flourished [13-16].
These mutant lines have been classified phenotypically
and are nowadays maintained and distributed by seed
banks. To further support the utilization of these
resources in research and breeding, 881 original mutants
have been backcrossed to cultivar (cv.) Bowman as a
recurrent parent to obtain mutant alleles in a nearly isogenic
background. Array-based genotyping of these
introgression lines confirmed and broadly delimited introgression
intervals [17]. This legacy of half a century of meticulous
research has recently been complemented by several
mutant populations [18,19] that were obtained in a systematic
way via mutagenesis with ethyl methanesulfonate (EMS)
to empower reverse genetics.
In this regard, the mutants of barley have been
instrumental in confirming candidate genes discovered
through mapping in bi-parental populations [20] or
association panels [21]. However, the full exploitation of
the allelic diversity captured in these resources for basic
research and crop improvement has been impeded by
the lack of a reference genome sequence of barley. The
major obstacles in assembling the barley genome are its
sheer size (5 Gb) and its high content of repetitive DNA
(80%), which impose a heavy sequencing load and pose a
challenge to current assembly algorithms [22]. Boosted
by the enormous increase in sequencing throughput,
extensive sequence datasets have accumulated recently and
have been integrated with a genome-wide physical map
and high-density genetic maps [23]. A large fraction of
the low-copy portion of the barley genome is now
represented by contigs of a whole-genome shotgun assembly
which are positioned with a resolution of approximately
3 cM [24]. Moreover, an exome capture assay designed
on the basis of the annotated sequence assembly has
made approximately 60 Mb of mRNA-coding sequence
accessible to cost-efficient high-throughput resequencing
[25].
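
How genetically anchored contigs might stand in for a contiguous reference can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: the function `bin_by_genetic_position`, the contig names, the chromosome label and all numbers are hypothetical. Each contig is assumed to carry a chromosome and a genetic position in cM, and allele frequencies of variants on contigs that fall into the same bin are averaged. The default bin width of 3 cM simply mirrors the anchoring resolution quoted above.

```python
def bin_by_genetic_position(contig_cm, contig_freqs, bin_width_cm=3.0):
    """Average allele frequencies of variants within genetic bins.

    contig_cm:    dict contig -> (chromosome, position in cM)
    contig_freqs: dict contig -> list of allele frequencies on that contig
    Returns a dict (chromosome, bin start in cM) -> mean allele frequency.
    Contigs without a genetic position are skipped.
    """
    sums, counts = {}, {}
    for contig, freqs in contig_freqs.items():
        if contig not in contig_cm:
            continue  # unanchored contig: no position, cannot be placed
        chrom, cm = contig_cm[contig]
        key = (chrom, (cm // bin_width_cm) * bin_width_cm)
        sums[key] = sums.get(key, 0.0) + sum(freqs)
        counts[key] = counts.get(key, 0) + len(freqs)
    return {k: sums[k] / counts[k] for k in sorted(sums) if counts[k]}


# Hypothetical anchoring table and per-contig allele frequencies.
toy_contig_cm = {
    "contig_a": ("chrA", 40.2),
    "contig_b": ("chrA", 41.5),
    "contig_c": ("chrA", 95.0),
}
toy_contig_freqs = {
    "contig_a": [0.9, 1.0],
    "contig_b": [1.0],
    "contig_c": [0.5, 0.4],
    "contig_d": [0.7],  # not anchored, therefore ignored
}

if __name__ == "__main__":
    for key, mean in bin_by_genetic_position(
        toy_contig_cm, toy_contig_freqs
    ).items():
        print(key, round(mean, 3))
```

Averaging per bin reflects the fact that the order of contigs within one bin is unknown, so the bin width limits the resolution of any interval derived this way.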
To date, the complex sequence framework of barley
has not been used as a backbone for
mapping-by-sequencing. Although hopes are high, concerns
remain that the fragmentary and incompletely ordered
structure of the sequence assembly and the only partial
representation of the gene complement may stall
fast-forward genetics. Leveraging the physically and
genetically anchored sequence assembly, exome sequencing and
the extensive mutant collections available to the barley
research community, we put mapping-by-sequencing to
the test in barley and were able to rapidly identify a gene
underlying the many-noded dwarf (mnd) phenotype.
Results
mnd mutant phenotype
The original mnd mutant was generated by X-ray
mutagenesis at our institute in the 1950s [13]. The most
conspicuous characteristic of mnd plants is their shortened
plastochron, that is, a faster rate of leaf initiation.
Mutants have on average twice as many leaves as
wildtype plants as a result of a faster emergence of leaves
(Figure 1). Moreover, culm internode lengths are
decreased in the mutant. Despite the larger number of
internodes (eight to nine in the mutant versus four to five
in the wildtype), plant height is reduced by about one
third under field conditions, but not in the greenhouse
(Figure 1d). Apart from spacing, the shape of leaves
is also altered in the mutant: leaves are narrower and more
erect compared to the wildtype. Additional
characteristics of mnd are an increased number of tillers (vegetative
shoot branches arising from lateral meristems) and
shorter spikes (Figure 1b; Additional file 1: Figure S1).
Allele freq (...truncated)