
Search: authors:"Guohui Lin"

24 papers found.

A (1.4 + epsilon)-Approximation Algorithm for the 2-Max-Duo Problem

The maximum duo-preservation string mapping (Max-Duo) problem is the complement of the well-studied minimum common string partition (MCSP) problem, both of which have applications in many fields including text compression and bioinformatics. k-Max-Duo is the restricted version of Max-Duo, where every letter of the alphabet occurs at most k times in each of the strings, which is...

Single Machine Scheduling with Job-Dependent Machine Deterioration

We consider the single machine scheduling problem with job-dependent machine deterioration. In the problem, we are given a single machine with an initial non-negative maintenance level, and a set of jobs each with a non-preemptive processing time and a machine deterioration. Such a machine deterioration quantifies the decrement in the machine maintenance level after processing...

Genotype Imputation Methods and Their Effects on Genomic Predictions in Cattle

In this study, we reviewed six imputation methods (Impute 2, FImpute 2.2, Beagle 4.1, Beagle 3.3.2, MaCH, and Bimbam) and evaluated the accuracy of imputation from simulated 6K bovine SNPs to 50K SNPs with 1800 beef cattle from two purebred and four crossbred populations and the impact of imputed genotypes on performance of genomic predictions for residual feed intake (RFI) in...

Whole genome SNP genotype piecemeal imputation

Background: Despite ongoing reductions in the cost of sequencing technologies, whole genome SNP genotype imputation is often used as an alternative for obtaining abundant SNP genotypes for genome wide association studies. Several existing genotype imputation methods can be efficient for this purpose, while achieving various levels of imputation accuracy. Recent empirical results...

Isomorphism and similarity for 2-generation pedigrees

We consider the emerging problem of comparing the similarity between (unlabeled) pedigrees. More specifically, we focus on the simplest pedigrees, namely, the 2-generation pedigrees. We show that the isomorphism testing for two 2-generation pedigrees is GI-hard. If the 2-generation pedigrees are monogamous (i.e., each individual at level-1 can mate with exactly one partner) then...

Preface

Hangzhou Dianzi University, Xiasha Higher Education Zone, Hangzhou, Zhejiang 310018, China. Guangting Chen, Guohui Lin, and Zhiyi Tan. Z. Tan: Department of Mathematics, Zhejiang University, Hangzhou

Parallel Metabolomic Profiling of Cerebrospinal Fluid and Serum for Identifying Biomarkers of Injury Severity after Acute Human Spinal Cord Injury


Fast accurate missing SNP genotype local imputation

Single nucleotide polymorphism (SNP) genotyping assays normally give rise to a certain percentage of no-calls; the problem becomes severe when the target organisms, such as cattle, do not have a high-resolution genomic sequence. Missing SNP genotypes, when related to target traits, would confound downstream data analyses such as genome-wide association studies (GWAS). Existing...

ComPhy: prokaryotic composite distance phylogenies inferred from whole-genome gene sets

Background: With the increasing availability of whole genome sequences, it is becoming more and more important to use complete genome sequences for inferring species phylogenies. We developed a new tool ComPhy, 'Composite Distance Phylogeny', based on a composite distance matrix calculated from the comparison of complete gene sets between genome pairs to produce a prokaryotic...

Selecting dissimilar genes for multi-class classification, an application in cancer subtyping

Background: Gene expression microarray is a powerful technology for genetic profiling of diseases and their associated treatments. Such a process involves a key step of identifying biomarkers, which are expected to be closely related to the disease. A most important use of these identified genes is to construct a classifier which can effectively diagnose...

Protein contact order prediction from primary sequences

Background: Contact order is a topological descriptor that has been shown to be correlated with several interesting protein properties such as protein folding rates and protein transition state placements. Contact order has also been used to select for viable protein folds from ab initio protein structure prediction programs. For proteins of known three-dimensional structure...

Identification of linked regions using high-density SNP genotype data in linkage analysis

Motivation: With the knowledge of a large number of SNPs in the human genome and the fast development of high-throughput genotyping technologies, identification of linked regions in linkage analysis through allele sharing status determination will play an ever more important role, while consideration of recombination fractions becomes unnecessary. Results: In this study, we have developed a...

A stable gene selection in microarray data analysis

Background: Microarray data analysis is notorious for involving a huge number of genes compared to a relatively small number of samples. Gene selection aims to detect the most significantly differentially expressed genes under different conditions, and it has been a central research focus. In general, a better gene selection method can improve the performance of classification...

CS23D: a web server for rapid protein structure generation using NMR chemical shifts and sequence data

CS23D (chemical shift to 3D structure) is a web server for rapidly generating accurate 3D protein structures using only assigned nuclear magnetic resonance (NMR) chemical shifts and sequence data as input. Unlike conventional NMR methods, CS23D requires no NOE and/or J-coupling data to perform its calculations. CS23D accepts chemical shift files in either SHIFTY or BMRB formats...

Identifying a few foot-and-mouth disease virus signature nucleotide strings for computational genotyping

Background: Serotypes of the foot-and-mouth disease viruses (FMDVs) were generally determined by biological experiments. Computational genotyping is not well studied even with the availability of whole viral genomes, due to uneven evolution among genes as well as frequent genetic recombination. Naively using sequence comparison for genotyping is only able to achieve a limited...

Most parsimonious haplotype allele sharing determination

Background: The "common disease – common variant" hypothesis and genome-wide association studies have achieved numerous successes in the last three years, particularly in genetic mapping in human diseases. Nevertheless, the power of the association study methods is still low, in particular on quantitative traits, and the description of the full allelic spectrum is deemed still...

Nucleotide composition string selection in HIV-1 subtyping using whole genomes

Motivation: The availability of the whole genomic sequences of HIV-1 viruses provides an excellent resource for studying the HIV-1 phylogenies using all the genetic materials. However, such huge volumes of data create computational challenges in both memory consumption and CPU usage. Results: We propose the complete composition vector representation for an HIV-1 strain, and a...