Search: authors:"Li-San Wang"

30 papers found.

McEnhancer: predicting gene expression via semi-supervised assignment of enhancers to target genes

Transcriptional enhancers regulate spatio-temporal gene expression. While genomic assays can identify putative enhancers en masse, assigning target genes is a complex challenge. We devised a machine learning approach, McEnhancer, which links target genes to putative enhancers via a semi-supervised learning algorithm that predicts gene expression patterns based on enriched...

A comprehensive database of high-throughput sequencing-based RNA secondary structure probing data (Structure Surfer)

Background: RNA molecules fold into complex three-dimensional shapes, guided by the pattern of hydrogen bonding between nucleotides. This pattern of base pairing, known as RNA secondary structure, is critical to their cellular function. Recently several diverse methods have been developed to assay RNA secondary structure on a transcriptome-wide scale using high-throughput...

DASHR: database of small human noncoding RNAs

Small non-coding RNAs (sncRNAs) are highly abundant RNAs, typically <100 nucleotides long, that act as key regulators of diverse cellular processes. Although thousands of sncRNA genes are known to exist in the human genome, no single database provides searchable, unified annotation, and expression information for full sncRNA transcripts and mature RNA products derived from these...

Transcriptomic Changes Due to Cytoplasmic TDP-43 Expression Reveal Dysregulation of Histone Transcripts and Nuclear Chromatin

TAR DNA-binding protein 43 (TDP-43) is normally a nuclear RNA-binding protein that exhibits a range of functions including regulation of alternative splicing, RNA trafficking, and RNA stability. However, in amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration with TDP-43 inclusions (FTLD-TDP), TDP-43 is abnormally phosphorylated, ubiquitinated, and cleaved...

Analysis of Nonlinear Gene Expression Progression Reveals Extensive Pathway and Age-Specific Transitions in Aging Human Brains

Wang. Hitoshi Okazawa, Tokyo Medical and Dental University, Japan. Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.

CoRAL: predicting non-coding RNAs from small RNA-sequencing data

The surprising observation that virtually the entire human genome is transcribed means we know little about the function of many emerging classes of RNAs, except their astounding diversities. Traditional RNA function prediction methods rely on sequence or alignment information, which are limited in their abilities to classify the various collections of non-coding RNAs (ncRNAs...

High-throughput identification of long-range regulatory elements and their target promoters in the human genome

Enhancer elements are essential for tissue-specific gene regulation during mammalian development. Although these regulatory elements are often distant from their target genes, they affect gene expression by recruiting transcription factors to specific promoter regions. Because of this long-range action, the annotation of enhancer element–target promoter pairs remains elusive...

SAVoR: a server for sequencing annotation and visualization of RNA structures

RNA secondary structure is required for the proper regulation of the cellular transcriptome. This is because the functionality, processing, localization and stability of RNAs are all dependent on the folding of these molecules into intricate structures through specific base pairing interactions encoded in their primary nucleotide sequences. Thus, as the number of RNA sequencing...

DACTAL: divide-and-conquer trees (almost) without alignments

Motivation: While phylogenetic analyses of datasets containing 1000–5000 sequences are challenging for existing methods, the estimation of substantially larger phylogenies poses a problem of much greater complexity and scale. Methods: We present DACTAL, a method for phylogeny estimation that produces trees from unaligned sequence datasets without ever needing to estimate an...

Genome-wide association reveals genetic effects on human Aβ42 and τ protein levels in cerebrospinal fluids: a case control study

Background: Alzheimer's disease (AD) is common and highly heritable with many genes and gene variants associated with AD in one or more studies, including APOE ε2/ε3/ε4. However, the genetic backgrounds for normal cognition, mild cognitive impairment (MCI) and AD in terms of changes in cerebrospinal fluid (CSF) levels of Aβ1-42, T-tau, and P-tau181P, have not been clearly...

DRAW+SneakPeek: Analysis workflow and quality metric management for DNA-seq experiments

Summary: We report our new DRAW+SneakPeek software for DNA-seq analysis. DNA resequencing analysis workflow (DRAW) automates the workflow of processing raw sequence reads including quality control, read alignment and variant calling on high-performance computing facilities such as Amazon elastic compute cloud. SneakPeek provides an effective interface for reviewing dozens of...

Evolutionary genomics of host-use in bifurcating demes of RNA virus phi-6

Background: Viruses are exceedingly diverse in their evolved strategies to manipulate hosts for viral replication. However, despite these differences, most virus populations will occasionally experience two commonly-encountered challenges: growth in variable host environments, and growth under fluctuating population sizes. We used the segmented RNA bacteriophage ϕ6 as a model for...

A Bayesian approach to efficient differential allocation for resampling-based significance testing

Background: Large-scale statistical analyses have become hallmarks of post-genomic era biological research due to advances in high-throughput assays and the integration of large biological databases. One accompanying issue is the simultaneous estimation of p-values for a large number of hypothesis tests. In many applications, a parametric assumption in the null distribution such...

Enhanced position weight matrices using mixture models

Sridhar Hannenhalli (Department of Genetics) and Li-San Wang (Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA). Motivation: Positional weight matrix (PWM) is derived...

Age-Correlated Gene Expression in Normal and Neurodegenerative Human Brain Tissues

Background: Human brain aging has received special attention in part because of the elevated risks of neurodegenerative disorders such as Alzheimer's disease in seniors. Recent technological advances enable us to investigate whether similar mechanisms underlie aging and neurodegeneration, by quantifying the similarities and differences in their genome-wide gene expression profiles...

Dense subgraph computation via stochastic search: application to detect transcriptional modules

Motivation: In a tri-partite biological network of transcription factors, their putative target genes, and the tissues in which the target genes are differentially expressed, a tightly inter-connected (dense) subgraph may reveal knowledge about tissue specific transcription regulation mediated by a specific set of transcription factors—a tissue-specific transcriptional module...

Correcting population stratification in genetic association studies using a phylogenetic approach

Motivation: The rapid development of genotyping technology and extensive cataloguing of single nucleotide polymorphisms (SNPs) across the human genome have made genetic association studies the mainstream for gene mapping of complex human diseases. For many diseases, the most practical approach is the population-based design with unrelated individuals. Although having the...

Altered gene expression in the Werner and Bloom syndromes is associated with sequences having G-quadruplex forming potential

The human Werner and Bloom syndromes (WS and BS) are caused by deficiencies in the WRN and BLM RecQ helicases, respectively. WRN, BLM and their Saccharomyces cerevisiae homologue Sgs1, are particularly active in vitro in unwinding G-quadruplex DNA (G4-DNA), a family of non-canonical nucleic acid structures formed by certain G-rich sequences. Recently, mRNA levels from loci...

TREMOR—a tool for retrieving transcriptional modules by incorporating motif covariance

Larry N. Singh, Li-San Wang and Sridhar Hannenhalli (Penn Center for Bioinformatics, University of Pennsylvania, Philadelphia, PA 19104, USA). A transcriptional module (TM) is a collection of...

The Cobweb of Life Revealed by Genome-Scale Estimates of Horizontal Gene Transfer

With the availability of increasing amounts of genomic sequences, it is becoming clear that genomes experience horizontal transfer and incorporation of genetic information. However, to what extent such horizontal gene transfer (HGT) affects the core genealogical history of organisms remains controversial. Based on initial analyses of complete genomic sequences, HGT has been...