RNA-Seq Based De Novo Transcriptome Assembly and Gene Discovery of Cistanche deserticola Fleshy Stem

PLOS ONE, Dec 2019

Background: Cistanche deserticola is a completely non-photosynthetic parasitic plant of great medicinal value that is mainly distributed in the deserts of Northwest China. Its dried fleshy stem is a crucial tonic in traditional Chinese medicine, used mainly to improve male sexual function and strengthen immunity, but few mechanistic studies have been conducted, partly because genomic and transcriptomic resources are lacking.

Results: In this study, we performed deep transcriptome sequencing of the fleshy stem of C. deserticola, generating about 80 million reads by Illumina paired-end sequencing on the HiSeq 2000 platform. Using the Trinity assembler, we obtained 95,787 transcript sequences ranging in length from 200 bp to 15,698 bp, with an average length of 950 bases and an N50 length of 1,519 bases. Of these, 63,957 transcripts were considered actively expressed (FPKM ≥ 0.5), of which 30,098 were annotated with gene descriptions or gene ontology terms by sequence similarity searches against several public databases (UniProt, NR and Nt at NCBI, and KEGG). Furthermore, we identified key enzyme genes involved in the biosynthesis of lignin and of phenylethanoid glycosides (PhGs), which are known to be the primary active ingredients. Four genes encoding phenylalanine ammonia-lyase (PAL), the first key enzyme in lignin and PhG biosynthesis, were identified based on sequence comparison and phylogenetic analysis. Two biosynthesis pathways of PhGs were also proposed for the first time.

Conclusions: We completed a global analysis of the C. deserticola fleshy stem transcriptome using RNA-seq technology. A collection of enzyme genes related to the biosynthesis of lignin and phenylethanoid glycosides was identified from the assembled and annotated transcripts, and the PAL gene family was also predicted. The sequence data from this study will provide a valuable resource for future research on phenylethanoid glycoside biosynthesis and for functional genomic studies in this important medicinal plant.
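
The two summary statistics quoted in the abstract, the assembly N50 and the FPKM ≥ 0.5 expression cut-off, can be illustrated with a short sketch. This is not the authors' pipeline: the function names (`n50`, `fpkm`, `expressed`) and the toy numbers are hypothetical, and the code only applies the standard definitions (N50 as the length at which the cumulative sum of length-sorted transcripts reaches half the assembly size; FPKM as fragments per kilobase of transcript per million mapped fragments).

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Return the N50 of a set of transcript (or contig) lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence downwards until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def expressed(fpkm_by_transcript: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the names of transcripts whose FPKM meets the threshold."""
    return [name for name, value in fpkm_by_transcript.items() if value >= threshold]


if __name__ == "__main__":
    # Toy lengths (hypothetical, not taken from the paper's assembly).
    toy_lengths = [200, 400, 600, 800, 1000]
    print("N50:", n50(toy_lengths))

    # Toy expression values (hypothetical transcript names and counts).
    toy_fpkm = {
        "transcript_a": fpkm(50, 2000, 10_000_000),
        "transcript_b": fpkm(1, 4000, 10_000_000),
    }
    print("Expressed at FPKM >= 0.5:", expressed(toy_fpkm))
```

In the toy data the total length is 3,000 bases, so the cumulative sum first reaches half of it (1,500) at the 800-base transcript, which is therefore the N50. In practice, assemblers and quantification tools report these values directly; the sketch only shows what the numbers mean.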

Yuli Li, Xiliang Wang, Tingting Chen, Fuwen Yao, Cuiping Li, Qingli Tang, Min Sun, Gaoyuan Sun, Songnian Hu, Jun Yu, Shuhui Song

1 CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
2 Core Genomic Facility, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
3 University of Chinese Academy of Sciences, Beijing, China
4 HongKui CongRong Group, Alashan, Inner Mongolia, China

Academic Editor: Zhong-Hua Chen, University of Western Sydney, Australia

Funding: HongKui CongRong Group provided support in the form of salaries for author QT, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific role of this author is articulated in the 'author contributions' section.

Competing Interests: Although one of the authors (QT) is employed by a commercial company [...]

C. deserticola belongs to Cistanche, a worldwide genus of perennial desert plants in the Orobanchaceae family; it is a completely non-photosynthetic holoparasitic plant that usually grows underground [1]. It parasitizes the roots of the psammophyte Haloxylon ammodendron (Chenopodiaceae) [1, 2], which mainly inhabits deserts and semi-deserts owing to its high tolerance to drought and salinity [1, 3]. C. deserticola shows strong resistance to harsh environmental conditions and is mainly distributed in Northwest China [4–6], especially in Inner Mongolia, Gansu and Xinjiang. In recent years it has come to be considered an endangered wild species because of increased consumption by humans [5, 6].

C. deserticola, often called desert ginseng and commonly known as desert-broomrape, has a dried fleshy stem that has been used extensively as an important traditional tonic in China and Japan for many years [4, 7–10]. It was first recorded in Shen Nong Ben Cao Jing (Dictionary of Chinese Materia Medica, 1977) [11] approximately 1800 years ago and was regarded as one of the main sources of the Chinese medicinal Herba Cistanche. Extracts of C. deserticola have a wide range of medicinal functions, including improving sexual function, tonifying the kidney, protecting the liver, and aperient, memory-enhancing, immunomodulatory, antioxidative, anti-inflammatory and antiviral activities [7–10, 12–15].

The major bioactive components of C. deserticola are phenylethanoid glycosides (PheGs, PhGs) [2, 9, 10, 14, 15]. To date, more than 20 phenylethanoid glycosides have been isolated from the succulent stem of C. deserticola [9, 14, 16]. Among them, acteoside and echinacoside are the two main components with significant pharmacological activities [2, 16], and they are documented as the quality standards for C. deserticola in the Chinese Pharmacopoeia (2005 and 2010 editions). PhGs consist of three chemical components: an organic acid, a saccharide and a phenylethanoid. However, the details of the phenylethanoid biosynthetic pathways in C. deserticola remain poorly understood.

Despite the commercial and medicinal importance of C. deserticola, genomic and transcriptomic data for this species are very limited. No ESTs are available in the NCBI database, and no complete genome information is available for this species apart from the chloroplast genome sequence [1]. The limited transcriptomic data hinder the study of PhG biosynthetic mechanisms. RNA-seq technology can generate sequences of the expressed parts of a targeted genome [17] and identify genes [18] using NGS platforms (such as Applied Biosystems SOLiD, Illumina HiSeq and Roche 454).
RNA-seq is becoming increasingly popular for de novo transcriptome assembly [19–22], since it is a cost-effective and powerful approach with high resolution and a broad dynamic range [23–25], and it is particularly well suited to detecting low-abundance transcripts [26]. Because of these advantages, RNA-seq is especially attractive for non-model organisms with limited genetic resources [27–29]. However, no detailed RNA-seq study of the C. deserticola transcriptome has been reported.

In this study, we globally sequenced the stem transcriptome of C. deserticola on the Illumina HiSeq 2000 platform and obtained 7.9 G of raw data. Through assembly and annotation, we mined the genes involved in PhG biosynthesis and the genes responsible for the entire lignin biosynthesis pathway. Our RNA-seq analysis generated the first C. deserticola consensus transcriptome and provides new insight into the medicinal value of C. deserticola. Additionally, the method described here can be widely applied to profile transcriptomes and so facilitate the discovery of genes involved in the biosynthesis of specific medicinal components in other medicinal plants with very limited genomic resources.

Mate (...truncated)


Article home page: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0125722

Yuli Li, Xiliang Wang, Tingting Chen, Fuwen Yao, Cuiping Li, Qingli Tang, Min Sun, Gaoyuan Sun, Songnian Hu, Jun Yu, Shuhui Song. RNA-Seq Based De Novo Transcriptome Assembly and Gene Discovery of Cistanche deserticola Fleshy Stem, PLOS ONE, 2015, Volume 10, Issue 5, DOI: 10.1371/journal.pone.0125722