Genome Biology

http://genomebiology.com/

List of Papers (Total 5,740)

Editing inducer elements increases A-to-I editing efficiency in the mammalian transcriptome

Adenosine to inosine (A-to-I) RNA editing has been shown to be an essential event that plays a significant role in neuronal function, as well as innate immunity, in mammals. It requires a structure that is largely double-stranded for catalysis but little is known about what determines editing efficiency and specificity in vivo. We have previously shown that some editing sites ...

Alignment-free sequence comparison: benefits, applications, and tools

Alignment-free sequence analyses have been applied to problems ranging from whole-genome phylogeny to the classification of protein families, identification of horizontally transferred genes, and detection of recombined sequences. The strength of these methods makes them particularly useful for next-generation sequencing data processing and analysis. However, many researchers are ...

From structure to function, how bioinformatics help to reveal functions of our genomes

A report on the 13th International Bioinformatics Workshop held in Harbin, China, 5–6 August 2017.

DNA methylation dynamics during early plant life

Cytosine methylation is crucial for gene regulation and silencing of transposable elements in mammals and plants. While this epigenetic mark is extensively reprogrammed in the germline and early embryos of mammals, the extent to which DNA methylation is reset between generations in plants remains largely unknown. Using Arabidopsis as a model, we uncovered distinct DNA methylation ...

Comprehensive benchmarking and ensemble approaches for metagenomic classifiers

One of the main challenges in metagenomics is the identification of microorganisms in clinical and environmental samples. While an extensive and heterogeneous set of computational tools is available to classify microorganisms using whole-genome shotgun sequencing data, comprehensive comparisons of these methods are limited. In this study, we use the largest-to-date set of ...

Topological organization and dynamic regulation of human tRNA genes during macrophage differentiation

The human genome is hierarchically organized into local and long-range structures that help shape cell-type-specific transcription patterns. Transfer RNA (tRNA) genes (tDNAs), which are transcribed by RNA polymerase III (RNAPIII) and encode RNA molecules responsible for translation, are dispersed throughout the genome and, in many cases, linearly organized into genomic clusters ...

Natural genetic variation of the cardiac transcriptome in non-diseased donors and patients with dilated cardiomyopathy

Genetic variation is an important determinant of RNA transcription and splicing, which in turn contributes to variation in human traits, including cardiovascular diseases. Here we report the first in-depth survey of heart transcriptome variation using RNA-sequencing in 97 patients with dilated cardiomyopathy and 108 non-diseased controls. We reveal extensive differences of gene ...

Dynamic DNA methylation reconfiguration during seed development and germination

Unlike animals, plants can pause their life cycle as dormant seeds. In both plants and animals, DNA methylation is involved in the regulation of gene expression and genome integrity. In animals, reprogramming erases and re-establishes DNA methylation during development. However, knowledge of reprogramming or reconfiguration in plants has been limited to pollen and the central cell. ...

Genome build information is an essential part of genomic track files

Genomic locations are represented as coordinates on a specific genome build version, but the build information is frequently missing when coordinates are provided. We show that this information is essential to correctly interpret and analyse the genomic intervals contained in genomic track files. Although not a substitute for best practices, we also provide a tool to predict the ...

Splatter: simulation of single-cell RNA sequencing data

As single-cell RNA sequencing (scRNA-seq) technologies have rapidly developed, so have analysis methods. Many methods have been tested, developed, and validated using simulated datasets. Unfortunately, current simulations are often poorly documented, their similarity to real data is not demonstrated, or reproducible code is not available. Here, we present the Splatter Bioconductor ...

Identification of high-confidence RNA regulatory elements by combinatorial classification of RNA–protein binding sites

Crosslinking immunoprecipitation sequencing (CLIP-seq) technologies have enabled researchers to characterize transcriptome-wide binding sites of RNA-binding protein (RBP) with high resolution. We apply a soft-clustering method, RBPgroup, to various CLIP-seq datasets to group together RBPs that specifically bind the same RNA sites. Such combinatorial clustering of RBPs helps ...

Non-base-contacting residues enable kaleidoscopic evolution of metazoan C2H2 zinc finger DNA binding

The C2H2 zinc finger (C2H2-ZF) is the most numerous protein domain in many metazoans, but is not as frequent or diverse in other eukaryotes. The biochemical and evolutionary mechanisms that underlie the diversity of this DNA-binding domain exclusively in metazoans are, however, mostly unknown. Here, we show that the C2H2-ZF expansion in metazoans is facilitated by contribution of ...

Chrom3D: three-dimensional genome modeling from Hi-C and nuclear lamin-genome contacts

Current three-dimensional (3D) genome modeling platforms are limited by their inability to account for radial placement of loci in the nucleus. We present Chrom3D, a user-friendly whole-genome 3D computational modeling framework that simulates positions of topologically-associated domains (TADs) relative to each other and to the nuclear periphery. Chrom3D integrates chromosome ...

Human disease genomics: from variants to biology

We summarize the remarkable progress that has been made in the identification and functional characterization of DNA sequence variants associated with disease.

Discovery and functional prioritization of Parkinson’s disease candidate genes from large-scale whole exome sequencing

Background: Whole-exome sequencing (WES) has been successful in identifying genes that cause familial Parkinson’s disease (PD). However, until now this approach has not been deployed to study large cohorts of unrelated participants. To discover rare PD susceptibility variants, we performed WES in 1148 unrelated cases and 503 control participants. Candidate genes were subsequently ...

Correcting for cell-type effects in DNA methylation studies: reference-based method outperforms latent variable approaches in empirical studies

Based on an extensive simulation study, McGregor and colleagues recently recommended the use of surrogate variable analysis (SVA) to control for the confounding effects of cell-type heterogeneity in DNA methylation association studies in scenarios where no cell-type proportions are available. As their recommendation was mainly based on simulated data, we sought to replicate ...

Genome-wide mapping of 5-hydroxymethyluracil in the eukaryote parasite Leishmania

Background: 5-Hydroxymethyluracil (5hmU) is a thymine base modification found in the genomes of a diverse range of organisms. To explore the functional importance of 5hmU, we develop a method for the genome-wide mapping of 5hmU-modified loci based on a chemical tagging strategy for the hydroxymethyl group. Results: We apply the method to generate genome-wide maps of 5hmU in the ...

Response to: Correcting for cell-type effects in DNA methylation studies: reference-based method outperforms latent variable approaches in empirical studies

We thank Hattab and colleagues for their correspondence and their investigation of cell-type mixture correction methods in methyl-CG binding domain sequencing. Here, we speculate on why surrogate variable analysis (SVA) performed differently between their two data sets, and poorly in one of them. Please see related Correspondence article: ...

Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution

We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control ...

Genome-wide analysis of differential transcriptional and epigenetic variability across human immune cell types

Background: A healthy immune system requires immune cells that adapt rapidly to environmental challenges. This phenotypic plasticity can be mediated by transcriptional and epigenetic variability. Results: We apply a novel analytical approach to measure and compare transcriptional and epigenetic variability genome-wide across CD14+CD16− monocytes, CD66b+CD16+ neutrophils, and ...

Allele-specific analysis of cell fusion-mediated pluripotent reprograming reveals distinct and predictive susceptibilities of human X-linked genes to reactivation

Background: Inactivation of one X chromosome is established early in female mammalian development and can be reversed in vivo and in vitro when pluripotency factors are re-expressed. The extent of reactivation along the inactive X chromosome (Xi) and the determinants of locus susceptibility are, however, poorly understood. Here we use cell fusion-mediated pluripotent reprograming to ...

Estimating and accounting for tumor purity in the analysis of DNA methylation data from cancer studies

We present a set of statistical methods for the analysis of DNA methylation microarray data, which account for tumor purity. These methods are an extension of our previously developed method for purity estimation; our updated method is flexible, efficient, and does not require data from reference samples or matched normal controls. We also present a method for incorporating purity ...

Integrated genome-wide analysis of expression quantitative trait loci aids interpretation of genomic association studies

Background: Identification of single nucleotide polymorphisms (SNPs) associated with gene expression levels, known as expression quantitative trait loci (eQTLs), may improve understanding of the functional role of phenotype-associated SNPs in genome-wide association studies (GWAS). The small sample sizes of some previous eQTL studies have limited their statistical power. We ...