tDRmapper: challenges and solutions to mapping, naming, and quantifying tRNA-derived RNAs from human small RNA-sequencing data

Nov 2015

Background Small RNA-sequencing has revealed the diversity and high abundance of small RNAs derived from tRNAs, referred to as tRNA-derived RNAs. However, at present, there is no standardized nomenclature and there are no methods for accurate annotation and quantification of these small RNAs. tRNA-derived RNAs have unique features that limit the utility of conventional alignment tools and quantification methods. Results We describe here the challenges of mapping, naming, and quantifying tRNA-derived RNAs and present a novel method that addresses them, called tDRmapper. We then use tDRmapper to perform a comparative analysis of tRNA-derived RNA profiles across different human cell types and diseases. We found that (1) tRNA-derived RNA profiles can differ dramatically across different cell types and disease states, (2) that positions and types of chemical modifications of tRNA-derived RNAs vary by cell type and disease, and (3) that entirely different tRNA-derived RNA species can be produced from the same parental tRNA depending on the cell type. Conclusion tDRmappernot only provides a standardized nomenclature and quantification scheme, but also includes graphical visualization that facilitates the discovery of novel tRNA and tRNA-derived RNA biology.

Article PDF cannot be displayed. You can download it here:

http://www.biomedcentral.com/content/pdf/s12859-015-0800-0.pdf

tDRmapper: challenges and solutions to mapping, naming, and quantifying tRNA-derived RNAs from human small RNA-sequencing data

Selitsky and Sethupathy BMC Bioinformatics (2015) 16:354 DOI 10.1186/s12859-015-0800-0 METHODOLOGY ARTICLE Open Access tDRmapper: challenges and solutions to mapping, naming, and quantifying tRNA-derived RNAs from human small RNA-sequencing data Sara R. Selitsky1,2,3* and Praveen Sethupathy1,2,4 Abstract Background: Small RNA-sequencing has revealed the diversity and high abundance of small RNAs derived from tRNAs, referred to as tRNA-derived RNAs. However, at present, there is no standardized nomenclature and there are no methods for accurate annotation and quantification of these small RNAs. tRNA-derived RNAs have unique features that limit the utility of conventional alignment tools and quantification methods. Results: We describe here the challenges of mapping, naming, and quantifying tRNA-derived RNAs and present a novel method that addresses them, called tDRmapper. We then use tDRmapper to perform a comparative analysis of tRNA-derived RNA profiles across different human cell types and diseases. We found that (1) tRNA-derived RNA profiles can differ dramatically across different cell types and disease states, (2) that positions and types of chemical modifications of tRNA-derived RNAs vary by cell type and disease, and (3) that entirely different tRNA-derived RNA species can be produced from the same parental tRNA depending on the cell type. Conclusion: tDRmappernot only provides a standardized nomenclature and quantification scheme, but also includes graphical visualization that facilitates the discovery of novel tRNA and tRNA-derived RNA biology. Keywords: tRNA, tDR, Sequencing, RNA modifications, Bioinformatics Background Transfer RNAs (tRNAs) are non-coding RNAs that deliver amino acids to ribosomes during translation. tRNA-derived RNAs (tDRs) are small RNAs that are enzymatically processed from either nascent pre-tRNA transcripts or mature tRNAs [1]. Their regulated biogenesis and well-defined 5′ and 3′ ends indicate that they are not products of tRNA degradation [2]. tDRs are generated in organisms from all domains of life [3]. They are derived from most tRNA genes and produced in varying abundance, in a variety of different sizes, and from different regions of the tRNA. Several functions have been attributed to tDRs such as post-transcriptional [4, 5] and translational repression [6], stress granule formation * Correspondence: 1 Bioinformatics and Computational Biology Curriculum, University of North Carolina, Chapel Hill, NC, USA 2 Departments of Genetics, University of North Carolina, Chapel Hill, NC, USA Full list of author information is available at the end of the article [7], and protection from apoptosis [8]; however, all of these have been in the context of cell culture. The role of tDRs in human health is only now starting to emerge. tDRs may play a role in neurodegeneration [9], cancer [5, 10], as well as immune modulation [11], and we previously showed that tDRs are significantly increased in the liver tissue of patients with chronic viral hepatitis and decreased in liver cancer [2]. Despite the potential biomedical significance of tDRs, the field is lagging behind other small RNA fields in terms of genomic annotation and strategies for quantification from small RNA-sequencing (small RNA-seq) data. This is due in large part to: (a) the unique computational challenges of mapping tDRs from small RNAseq data and (b) the lack of a standardized nomenclature for tDRs. © 2015 Selitsky and Sethupathy. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Selitsky and Sethupathy BMC Bioinformatics (2015) 16:354 a) While small RNA-seq has enabled the discovery of tDRs, these small RNAs are difficult to map accurately for at least three reasons: (1)Exact copies of tRNA genes are present in numerous locations throughout the human genome, and annotation of tRNAs in the human genome is still incomplete. This means that small RNA-seq reads corresponding to tRNA-derived RNAs can map with equal fidelity to numerous locations throughout the genome (multi-mapping), which leads to ambiguity about the precise origin of the tRNA-derived RNA. (2)tDRs are derived from both the nascent pre-tRNAs and the processed, mature tRNAs. The maturation process of eukaryotic tRNAs includes several steps, such as the removal of 5′ leader sequence and 3′ trailer sequence, the addition of a non-templated “CCA” to the 3′ end, and the excision of introns. These changes during maturation need to be accounted for when mapping tRNA-derived RNA reads. For example, spliced reads (those derived from the sequence flanking the spliced intron) or reads that contain a non-templated 3′-end “CCA” will not map to the genome. (3)tRNAs are subject to extensive chemical modifications at specific nucleotide positions during maturation [12]. As a result, tDRs most likely harbor these modifications, which can lead to errors during cDNA synthesis due to reverse transcriptase pausing and mis-incorporation of nucleosides [13]. These errors manifest in small RNA-seq reads as mismatches and deletions relative to the reference tRNA sequence. These mismatches/deletions will be referred to as “error type.” b) There is no standardized nomenclature for tDRs. tDRs are produced in a variety of different sizes, from a variety of different tRNAs, and from a variety of different locations within the tRNAs, all of which present challenges for a coherent naming system. A standard naming scheme is critical to facilitate future research. For example, it is at present extremely difficult to use published studies to define the bio-distribution of specific tDRs (the tissues and conditions in which specific tDRs are expressed) because the same or similar tDR is often referred to by completely different names (and in some cases a name is not given at all). In this study we introduce a tool designed to address the challenges of mapping, naming, and quantifying tDRs, called tDRmapper. tDRmapper was Page 2 of 13 designed specifically for human small RNA-seq data (single-end, 50x) generated on the Illumina sequencing platform using cDNA libraries that were prepared using the Illumina TruSeq protocol. We used tDRmapper to analyze publically available small RNAseq datasets (total n = 45) from four categories of cell types/tissues. These analyses helped shape the final version of the tool and also led to the discovery of new types of tDR species as well as novel insights about potentially varying pa (...truncated)


This is a preview of a remote PDF: http://www.biomedcentral.com/content/pdf/s12859-015-0800-0.pdf
Article home page: http://www.biomedcentral.com/1471-2105/16/354

Sara Selitsky, Praveen Sethupathy. tDRmapper: challenges and solutions to mapping, naming, and quantifying tRNA-derived RNAs from human small RNA-sequencing data, 2015, pp. 354, 16, DOI: 10.1186/s12859-015-0800-0