The salinity tolerant poplar database (STPD): a comprehensive database for studying tree salt-tolerant adaption and poplar genomics
Ma et al. BMC Genomics
Yazhen Ma 0
Ting Xu 0
Dongshi Wan 0
Tao Ma
Sheng Shi
Jianquan Liu
Quanjun Hu
0 Equal contributors; Molecular Ecology Group, State Key Laboratory of Grassland and Agro-Ecosystems, School of Life Sciences, Lanzhou University, Lanzhou 730000, Gansu, China
Background: Soil salinity is a significant factor that impairs plant growth and agricultural productivity, and numerous efforts are underway to enhance the salt tolerance of economically important plants. Populus species are widely cultivated for diverse uses. Notably, they grow in a range of habitats, from saline soils to mesophytic environments, and are therefore used as a model genus for elucidating the physiological and molecular mechanisms of stress tolerance in woody plants.
Description: The Salinity Tolerant Poplar Database (STPD) is an integrative database for salt-tolerant poplar genome biology. Currently the STPD contains the Populus euphratica genome and its related genetic resources. P. euphratica, which prefers saline habitats, has become a valuable genetic resource for the exploitation of tolerance characteristics in trees. The database contains curated data including the genomic sequence, genes and gene functional information, non-coding RNA sequences, transposable elements, simple sequence repeats and single nucleotide polymorphism information of P. euphratica, gene expression data for P. euphratica and Populus tomentosa, and whole-genome alignments between Populus trichocarpa, P. euphratica and Salix suchowensis. The STPD provides searching and data mining tools, including a GBrowse genome browser, BLAST servers and a genome alignment viewer, which can be used to browse genome regions, identify similar sequences and visualize genome alignments. Datasets within the STPD can also be downloaded to perform local searches.
Conclusions: A new Salinity Tolerant Poplar Database has been developed to assist studies of salt tolerance in trees and poplar genomics. The database will be continuously updated to incorporate new genome-wide data of related poplar species. It will serve as infrastructure for research on the molecular function of genes, comparative genomics and evolution in closely related species, and will promote advances in molecular breeding within Populus. The STPD can be accessed at http://me.lzu.edu.cn/stpd/.
Keywords: Salt-tolerant; Populus; Genome database; Tree adaptation; Abiotic stress
Background
Salinity is a main environmental constraint that renders
fields unproductive. It is also one of the most severe
abiotic stress factors affecting plant growth and agricultural
production worldwide [1]. To cope with this intractable
problem, many studies have explored
the physiological and molecular mechanisms of
plants that naturally display high salt resistance, or have used
plant breeding and biotechnological approaches to
enhance the stress resistance of salt-sensitive plant
species, especially those with significant economic
importance [2-5].
The genus Populus is widely distributed and consists
of many species that play important roles in bio-energy
production, environmental protection and afforestation
on degraded soils [6]. In addition to their conspicuous
economic values, these woody species also exhibit
different degrees of stress resistance as a consequence of
adaptation to different habitats [7,8], which makes them well
suited to addressing tree-specific questions of salt stress
tolerance [9].
Populus euphratica Oliv. is a salinity tolerant poplar,
which occurs in semiarid and arid areas [10]. It grows
under unfavorable conditions such as saline soils, but
sustains higher photosynthetic and growth rates than
other poplar species under high salinity [3,11]. Owing to its
extraordinary adaptation to salt stress, it has become a
model for elucidating salt resistance mechanisms in trees
[12]. Breeders have tried to increase tree salt tolerance
by crossing P. euphratica with other economically important species.
However, successful hybrids with the desired traits
are rare [7]. Therefore, molecular breeding provides a
promising alternative.
Based on the recently completed whole-genome
sequence of P. euphratica [12], we have built a
comprehensive web-based database, the STPD
(http://me.lzu.edu.cn/stpd/), to facilitate research on salinity
tolerance and molecular breeding of Populus.
Construction and content
The STPD currently gives public access to P. euphratica
genome assembly version 1.1, which was sequenced and
assembled using the fosmid pooling and hierarchical
approach. The final assembly covers a total length of
496.5 Mb [12], and 34,279 protein-coding genes were
predicted in the whole genome. In addition, 764 transfer
RNAs, 706 ribosomal RNAs, 4,826 small nuclear RNAs
and 266 microRNAs that are supported by small RNA
sequencing data were identified and included in the
database. We also incorporated gene expression data, which are
based on time-course profiling of differentially expressed
genes between P. euphratica and Populus tomentosa (a
salt-sensitive poplar) in response to salt stress. Moreover,
a total of 18,938 universal simple sequence repeat
(SSR) primer pairs were identified in the syntenic regions of
P. euphratica and Populus trichocarpa, and these SSRs
can be converted into genetic markers across most poplar
species. The STPD includes a GBrowse genome browser,
a gene search function, BLAST sequence searching and
other intuitive tools to facilitate the analysis of the genetic
data in salt-tolerant Populus (Figure 1).
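For readers unfamiliar with SSRs, the following is a minimal sketch of how perfect tandem repeats could be located in a DNA sequence. It is not the pipeline used to build the STPD; the function name find_ssrs, the unit lengths (2 to 6 bp) and the minimum repeat count are illustrative assumptions.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    Hypothetical helper for illustration only: thresholds are assumptions,
    not values reported for the STPD.
    """
    # Lazy quantifier: the shortest repeat unit is tried first, so
    # "ACACACAC" is reported as unit "AC" rather than "ACAC".
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip homopolymer runs such as "AAAAAAAA".
            continue
        repeat_count = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, repeat_count))
    return hits


if __name__ == "__main__":
    example = "TTG" + "AC" * 5 + "GGT"
    print(find_ssrs(example))  # [(3, 'AC', 5)]
```

Primer design around each repeat, and the restriction to syntenic regions described above, would be separate steps that this sketch does not attempt.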
Genome component
Repeated sequences within the P. euphratica genome were
identified by RepeatMasker (http://www.repeatmasker.org)
with two different libraries. The first is the Repbase
TE library (http://www.girinst.org/repbase), and the
second was produced by RepeatModeler, which yields
consensus sequences and classification information for
each repeat family as a repeat library.
Gene component
We obtained the gene dataset of P. euphratica using a
variety of strategies, including RNA-seq, homology and ab
initio gene prediction. The total predicted gene number is
34,279. Among these, 20,038 (58.46%) genes are supported
by RNA-seq and 32,182 (93.88%) have a homologue in at least one of
Ricinus communis, Cucumis sativus, P. trichocarpa and
Prunus persica [12]. This combined gene dataset was then
used as a reference and has been integrated into the STPD.
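The percentages quoted above follow directly from the gene counts. As a quick arithmetic check, the short sketch below recomputes them; the names TOTAL_GENES and percent_of_total are introduced here for illustration only.

```python
# Gene counts as reported in the text.
TOTAL_GENES = 34279
RNASEQ_SUPPORTED = 20038
WITH_HOMOLOGUE = 32182


def percent_of_total(count, total=TOTAL_GENES):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


if __name__ == "__main__":
    print(percent_of_total(RNASEQ_SUPPORTED))  # 58.46
    print(percent_of_total(WITH_HOMOLOGUE))    # 93.88
```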
The genes were annotated using a variety of databases,
including Swiss-Prot/TrEMBL (http://www.uniprot.org),
KEGG (http://www.genome.jp/kegg), InterPro (http://www.ebi.ac.uk/interpro)
and Gene Ontology (http://geneontology.org).
Swiss-Prot and TrEMBL annotations for the
predicted proteins were generated by performing BLASTP
searches (E-value cutoff 10^-5) against the Swiss-Prot and
TrEMBL databases. Genes were mapped to KEGG pathways
using KAAS [13]. We also used (...truncated)