Genome of Tenualosa ilisha from the river Padma, Bangladesh

BMC Research Notes, Dec 2018

Objective: Hilsa shad (Tenualosa ilisha) is a popular fish of Bangladesh belonging to the Clupeidae family. Like salmon and many other migratory fish, it is anadromous: it lives in the sea and travels to freshwater rivers to spawn. Over its lifetime, Tenualosa ilisha migrates from sea to freshwater and back.

Data description: The genome of Tenualosa ilisha collected from the river Padma at Rajshahi, Bangladesh has been sequenced, and its de novo hybrid assembly and structural annotation are reported here. Illumina and PacBio sequencing platforms were used for high-depth sequencing, and the draft genome assembly is 816 Mb with an N50 of 188 kb. The MAKER gene annotation tool predicted 31,254 gene models. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis indicates 95% completeness of the assembled genome.



BMC Research Notes, December 2018, 11:921
Open Access Data note. First online: 22 December 2018

Authors: Avizit Das, Peter Ianakiev, Abdul Baten, Rifath Nehleen, Tasneem Ehsan, Oly Ahmed, Mohammad Riazul Islam, M. Niamul Naser, Mong Sano Marma, Haseena Khan

Keywords: Hilsa shad, Tenualosa ilisha, Clupeidae, Whole genome sequence, NGS platform

Abbreviations
BUSCO: Benchmarking Universal Single-Copy Orthologs
PacBio: Pacific Biosciences
Gbp: giga base pair
Mb: mega base pair
Kb: kilo base pair
bp: base pair
GO: gene ontology
SDS: sodium dodecyl sulfate
EDTA: ethylenediaminetetraacetic acid
qPCR: quantitative polymerase chain reaction
SMRT: single molecule real time sequencing
MaSuRCA: Maryland Super-Read Celera Assembler
EST: expressed sequence tag
SNAP: Semi-HMM-based Nucleic Acid Parser

Objective

Hilsa shad, known as ilish in Bangladesh, is popular for its taste and the texture of its flesh. This species belongs to the shads of the Clupeidae family.
In addition to the Bay of Bengal and the rivers of Bangladesh (the Padma, Jamuna, Meghna, and other coastal rivers), this fish is also found in the Persian Gulf, Mediterranean Sea, Arabian Sea and China Sea [1]. Fisheries, a part of Bangladesh’s cultural heritage, have played an important role in its socioeconomic development in terms of protein supply, employment generation and foreign currency earnings. According to the FAO, in 2018 Bangladesh ranked 3rd in the world in inland fish production. Hilsa (Tenualosa ilisha) is the most popular among the 650 or so marine and inland fish found in Bangladesh. It contributes 11% of total fish production, 1% of the national GDP and 3.00% of total export earnings, and about 2.5 million people in Bangladesh depend directly on Hilsa to provide for their families [2, 3]. At present more than 60% of the global Hilsa catch is reported from Bangladesh, 20–25% from Myanmar, 15–20% from India and 5–10% from other countries (e.g., Iraq, Kuwait, Malaysia, Thailand and Pakistan). The recent Hilsa production of Bangladesh is about half a million metric tons [4]. In spite of such importance, Hilsa still lacks molecular genomic information. The significance of these data for improving the sustainability and maintaining the diversity of this fish therefore cannot be overemphasized.

Data description

Fresh Tenualosa ilisha samples were collected from the river Padma at Rajshahi and immediately preserved on dry ice. White and red muscles of the fish were used for DNA extraction. A modified SDS (sodium dodecyl sulfate) method [5], optimized in our lab, was used for DNA extraction (detailed methodology in Data file 1, Table 1).
Table 1 Overview of data files/data sets

Label | Name of data file/data set | File types (file extension) | Data repository and identifier (DOI or accession number)
Data file 1 | DNA isolation and library preparation methodology | .docs file | https://figshare.com/s/467b8b670149f1a0617c
Data file 2 | Whole genome assembly data | FASTA | NCBI GenBank (accession number: GCA_003651195.1) (http://identifiers.org/ncbi/insdc.gca:GCA_003651195.1)
Data file 3 | Whole genome sequence | FASTA | NCBI GenBank (accession numbers: QYSC01000001–QYSC01124209) (http://identifiers.org/ncbi/insdc:QYSC00000000)
Data file 4 | Annotation data file | .tsv | https://figshare.com/s/270b54d9d076ef5e5901

A paired-end library with an insert size of around 300 bp was constructed for Illumina sequencing using the NEB NEBNext Ultra II DNA kit (detailed methodology in Data file 1, Table 1). Genomic DNA was sequenced on the Illumina HiSeq 4000 and the Pacific Biosciences Sequel single molecule, real time (SMRT) sequencing platforms. The quality of the reads was checked using FastQC [6]. MaSuRCA (Maryland Super-Read Celera Assembler) ver 3.2.6 was used for hybrid de novo assembly [7] of the Illumina and PacBio data together. The genome assembly data have been deposited in NCBI GenBank under the accession number GCA_003651195.1 (Data file 2; Table 1).

Illumina-only data generated a fragmented assembly with 91% BUSCO [8] completeness. Adding 15.7 Gbp of PacBio data significantly improved the quality and contiguity of the genome. Compared to the Illumina-only assembly, N50 improved from 13 Kb (kilo base pair) to 188 Kb, and the total number of scaffolds fell from 475,121 to 124,209. The assembled genome size of Tenualosa ilisha Padma Bangladesh is now 816 Mb (mega base pair), and approximately 82% of the genome has been assembled. The 95% BUSCO completeness, together with the much lower number of scaffolds and the considerably better N50, indicates that the genome is of high quality.
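The N50 figures quoted above (13 Kb for the Illumina-only assembly, 188 Kb for the hybrid assembly) follow the usual definition: the length of the scaffold at which the running total, taken from the longest scaffold downwards, first reaches half of the assembly size. The sketch below illustrates that definition in Python. It is not the authors' pipeline; the function names `fasta_lengths` and `n50` and the toy records are illustrative assumptions.

```python
def fasta_lengths(lines):
    """Yield the sequence length of each FASTA record in an iterable of lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50 of a collection of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no sequence lengths given")


if __name__ == "__main__":
    # Toy records standing in for a real assembly FASTA (hypothetical data).
    toy = [">s1", "ACGTACGT", ">s2", "ACG", ">s3", "ACGTA"]
    lengths = list(fasta_lengths(toy))
    print("scaffolds:", len(lengths), "total bp:", sum(lengths), "N50:", n50(lengths))
```

For a real assembly, the same functions could be fed an open file handle for the FASTA in Data file 2 instead of the toy list.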
The genome sequence data have been deposited in NCBI GenBank under the accession numbers QYSC01000001–QYSC01124209 (Data file 3; Table 1). The MAKER ver 3.0 pipeline [9] was used for structural annotation. The GC content of the genome was determined to be 43.61%. RepeatMasker and RepeatModeler, using the latest version of the Repbase database [10, 11, 12], identified 27.27% repeat elements. Altogether, 31,254 gene models were predicted with the MAKER gene annotation pipeline, based on both de novo and reference-based predictions using genes/proteins from other fish species (Atlantic herring, carp, salmon, zebrafish). Of the 31,254 genes, 24,648 were annotated using InterProScan [13] and 16,078 were found to have at least one GO (gene ontology) term assigned to them (Data file 4, Table 1). The Hilsa genome was found to be comparable to that of the Atlantic herring (807 Mb genome and 28,335 genes) [14] and to the genome of the common carp (1.8 Gb and 52,000 genes) [15].

Limitations

The assembly contains 4605 unassembled regions (gaps), which together account for 2,268,925 bp.

Notes

Authors’ contributions
HK and MSM initiated the project. HK, MSM, MRI, PI, MNN and AD designed the overall project. HK and MRI led the project. AD and OA collected the samples with the help of MNN. AD and OA extracted the DNA. PI sequenced the Tenualosa ilisha Padma BD genome. AB assembled the genome and performed the structural and functional annotations. TE and RN performed the repeat and GC content analysis. HK, MRI and AD wrote the manuscript. HK, MRI, MSM, PI, AB, MNN, AD, OA, TE and RN reviewed the manuscript. All authors read and approved the final manuscript.

Acknowledgements
The authors acknowledge the support of Hera Biosciences for the sequencing service and that of Southern Cross University, Lismore, Australia for the computational support.

Competing interests
The authors declare that they have no competing interests.

Consent for publication
Not applicable.
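The GC content (43.61%) and the gap figures given under Limitations (4605 regions, 2,268,925 bp) are per-base tallies that can be recomputed from the deposited FASTA. The sketch below shows one way to do this in Python under two assumptions that the note does not state: gaps are represented as runs of N, and GC content is taken over unambiguous A/C/G/T bases only. The function name `gc_and_gaps` and the example sequences are illustrative, not part of the authors' analysis.

```python
import re


def gc_and_gaps(sequences):
    """Return (gc_percent, gap_count, gap_bases) for an iterable of sequences.

    Assumptions: a gap is any run of one or more N characters, and GC content
    is computed over A/C/G/T bases only (N and other symbols are ignored).
    """
    gc = acgt = gap_count = gap_bases = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
        for run in re.finditer(r"N+", seq):
            gap_count += 1
            gap_bases += len(run.group())
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return gc_percent, gap_count, gap_bases


if __name__ == "__main__":
    # Hypothetical scaffolds, each containing one gap.
    gc_percent, gap_count, gap_bases = gc_and_gaps(["ACGTNNNNACGT", "GGCCnnAT"])
    print(f"GC: {gc_percent:.2f}%  gaps: {gap_count}  gap bases: {gap_bases}")
```

A different convention, such as counting only N runs above a minimum length as gaps, would give different gap counts, so any comparison with the published figures depends on matching the convention used.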
Data availability
The genome sequence data are available at DDBJ/ENA/GenBank under the accession numbers QYSC01000001–QYSC01124209 and the assembled genome under GCA_003651195.1. The version described in this paper is the first version, QYSC00000000.1.

Ethics approval and consent to participate
The experiments mentioned in this study have been approved by the institutional review committee of the University of Dhaka.

Funding
This study did not receive any formal funding.

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Naser MN. Hilsa Shad (Tenualosa ilisha): the iconic fish of the Bengal Delta. In: Nishat B, Mandal S, Pangare G, editors. Conserving Ilish, securing livelihoods: Bangladesh-India initiatives. India: IWA, Academic Foundation; 2018. p. 37–52.
2. Sarker JM, Uddin AMMB, Patwary SAM, Tanmay MH, Rahman F, et al. Livelihood status of Hilsa (Tenualosa ilisha) fishermen of greater Noakhali regions of Bangladesh. Fish Aquac J. 2016;7:168. https://doi.org/10.4172/2150-3508.1000168.
3. FRSS. Yearbook of fisheries statistics of Bangladesh. Dhaka: Dept of Fisheries, GoB; 2017. p. 116.
4. Ahsan D, Naser N, Bhoumik U, Hazra S, Bhattacharya SB. Migration, spawning patterns and conservation of Hilsa shad (Tenualosa ilisha) in Bangladesh and India. Academic Foundation; 2016.
5. Kumar R, Singh PJ, Nagpure NS, Kushwaha B, Srivastava SK, Lakra WS. A non-invasive technique for rapid extraction of DNA from fish scales. Indian J Exp Biol. 2007;45(11):992–7.
6. FastQC program. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/. Accessed 12 Jan 2017.
7. Zimin AV, Marçais G, Puiu D, Roberts M, Salzberg SL, Yorke JA. The MaSuRCA genome assembler. Bioinformatics. 2013;29(21):2669–77.
8. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31(19):3210–2.
9. Cantarel BL, Korf I, Robb SM, Parra G, Ross E, Moore B, Holt C, Alvarado AS, Yandell M. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 2008;18(1):188–96.
10. Smit A, Hubley R, Green P. RepeatMasker open-4.0. 2013–2015. Seattle, WA, USA: Institute for Systems Biology; 2015. http://www.repeatmasker.org/faq.html.
11. Smit A, Hubley R. RepeatModeler open 1.0. Seattle, WA, USA: Institute for Systems Biology; 2008. http://www.repeatmasker.org//RepeatModeler/.
12. Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J. Repbase update, a database of eukaryotic repetitive elements. Cytogen Genome Res. 2005;110(1–4):462–7.
13. Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, Lopez R. InterProScan: protein domains identifier. Nucleic Acids Res. 2005;33(Suppl 2):W116–20.
14. Martinez Barrio A, Lamichhaney S, Fan G, Rafati N, Pettersson M, Zhang H, Dainat J, Ekman D, Höppner M, Jern P, Martin M, Nystedt B, Liu X, Chen W, Liang X, Shi C, Fu Y, Ma K, Zhan X, Feng C, Gustafson U, Rubin CJ, Sällman Almén M, Blass M, Casini M, Folkvord A, Laikre L, Ryman N, Ming-Yuen Lee S, Xu X, Andersson L. The genetic basis for ecological adaptation of the Atlantic herring revealed by genome sequencing. Elife. 2016;3(5):e12081.
15. Xu P, Zhang X, Wang X, Li J, Liu G, Kuang Y, Xu J, Zheng X, Ren L, Wang G, Zhang Y, et al. Genome sequence and genetic diversity of the common carp, Cyprinus carpio. Nat Genet. 2014;46(11):1212.

Copyright information
© The Author(s) 2018. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Authors and Affiliations
Avizit Das (1), Peter Ianakiev (2), Abdul Baten (3, 4), Rifath Nehleen (1), Tasneem Ehsan (1), Oly Ahmed (1), Mohammad Riazul Islam (1), M. Niamul Naser (5), Mong Sano Marma (6), Haseena Khan (1, email author)

1. Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka, Bangladesh
2. Hera Biosciences LLC, Medford, USA
3. AgResearch, Grasslands Research Centre, Palmerston North, New Zealand
4. Southern Cross Plant Science, Southern Cross University, Lismore, Australia
5. Department of Zoology, University of Dhaka, Dhaka, Bangladesh
6. Qiagen Sciences, Waltham, USA



Avizit Das, Peter Ianakiev, Abdul Baten, Rifath Nehleen, Tasneem Ehsan, Oly Ahmed, Mohammad Riazul Islam, M. Niamul Naser, Mong Sano Marma, Haseena Khan. Genome of Tenualosa ilisha from the river Padma, Bangladesh. BMC Research Notes, 2018, 11:921. DOI: 10.1186/s13104-018-4028-8