Database: The Journal of Biological Databases and Curation

http://www.oxfordjournals.org/our_journals/databa/about.html

List of Papers (Total 687)

Mining biomedical images towards valuable information retrieval in biomedical and life sciences

Jan 2016 | Zeeshan Ahmed, Saman Zeeshan, Thomas Dandekar

Biomedical images are helpful sources for scientists and practitioners in drawing significant hypotheses, exemplifying approaches and describing experimental results in published biomedical literature. In recent decades, there has been an enormous increase in the amount of heterogeneous biomedical image production and publication, which results in a need for bioimaging...

PIPE: a protein–protein interaction passage extraction module for BioCreative challenge

Jan 2016 | Yung-Chun Chang, Chun-Han Chu, Yu-Chen Su, et al.

Identifying the interactions between proteins mentioned in the biomedical literature is one of the frequently discussed topics of text mining in the life sciences. In this article, we propose PIPE, an interaction pattern generation module used in the Collaborative Biocurator Assistant Task at BioCreative V (http://www.biocreative.org/) to capture frequent protein-protein...

Onco-Regulon: an integrated database and software suite for site specific targeting of transcription factors of cancer genes

Jan 2016 | Navneet Tomar, Akhilesh Mishra, Nirotpal Mrinal, et al.

Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an...

Ricebase: a breeding and genetics platform for rice, integrating individual molecular markers, pedigrees and whole-genome-based data

Jan 2016 | J. D. Edwards, A. M. Baldo, L. A. Mueller

Ricebase (http://ricebase.org) is an integrative genomic database for rice (Oryza sativa) with an emphasis on combining datasets in a way that maintains the key links between past and current genetic studies. Ricebase includes DNA sequence data, gene annotations, nucleotide variation data and molecular marker fragment size data. Rice research has benefited from early adoption and...

BioC viewer: a web-based tool for displaying and merging annotations in BioC

Jan 2016 | Soo-Yong Shin, Sun Kim, W. John Wilbur, et al.

BioC is an XML-based format designed to provide interoperability for text mining tools and manual curation results. A challenge of BioC as a standard format is to align annotations from multiple systems. Ideally, this should not be a major problem if users follow guidelines given by BioC key files. Nevertheless, the misalignment between text and annotations happens quite often...

Crowdsourcing and curation: perspectives from biology and natural language processing

Jan 2016 | Lynette Hirschman, Karën Fort, Stéphanie Boué, et al.

Crowdsourcing is increasingly utilized for performing tasks in both natural language processing and biocuration. Although there have been many applications of crowdsourcing in these fields, there have been fewer high-level discussions of the methodology and its applicability to biocuration. This paper explores crowdsourcing for biocuration through several case studies that...

Improving the dictionary lookup approach for disease normalization using enhanced dictionary and query expansion

Jan 2016 | Jitendra Jonnagaddala, Toni Rose Jue, Nai-Wen Chang, et al.

The rapidly growing biomedical literature calls for an automatic approach to the recognition and normalization of disease mentions in order to increase the precision and effectiveness of disease-based information retrieval. A variety of methods have been proposed to deal with the problem of disease named entity recognition and normalization. Among all the proposed...

How much does curation cost?

Jan 2016 | Peter D. Karp

NIH administrators have recently expressed concerns about the cost of curation for biological databases. However, they did not articulate the exact costs of curation. Here we calculate the cost of biocuration of articles for the EcoCyc database as $219 per article over a 5-year period. That cost is 6–15% of the cost of open-access publication fees for publishing biomedical...

MODEM: multi-omics data envelopment and mining in maize

Jan 2016 | Haijun Liu, Fan Wang, Yingjie Xiao, et al.

MODEM is a comprehensive database of maize multidimensional omics data, including genomic, transcriptomic, metabolic and phenotypic information from the cellular to individual plant level. This initial release contains approximately 1.06 M high quality SNPs for 508 diverse inbred lines obtained by combining variations from RNA sequencing on whole kernels (15 days after...

NTTMUNSW BioC modules for recognizing and normalizing species and gene/protein mentions

Jan 2016 | Hong-Jie Dai, Onkar Singh, Jitendra Jonnagaddala, et al.

In recent years, the number of published biomedical articles has increased as researchers have focused on biological domains to investigate the functions of biological objects, such as genes and proteins. However, the ambiguous nature of genes and their products has rendered the literature more complex for readers and curators of molecular interaction databases. To address this...

PvTFDB: a Phaseolus vulgaris transcription factors database for expediting functional genomics in legumes

Jan 2016 | Bhawna, V.S. Bonthala, MNV Prasad Gajula

The common bean [Phaseolus vulgaris (L.)] is one of the essential proteinaceous vegetables grown in developing countries. However, its production is challenged by low yields caused by numerous biotic and abiotic stress conditions. Regulatory transcription factors (TFs) symbolize a key component of the genome and are the most significant targets for producing stress tolerant crop...

Sieve-based coreference resolution enhances semi-supervised learning model for chemical-induced disease relation extraction

Jan 2016 | Hoang-Quynh Le, Mai-Vu Tran, Thanh Hai Dang, et al.

The BioCreative V chemical-disease relation (CDR) track was proposed to accelerate the progress of text mining in facilitating integrative understanding of chemicals, diseases and their relations. In this article, we describe an extension of our system (namely UET-CAM) that participated in the BioCreative V CDR. The original UET-CAM system’s performance was ranked fourth among 18...

TMC-SNPdb: an Indian germline variant database derived from whole exome sequences

Jan 2016 | Pawan Upadhyay, Nilesh Gardi, Sanket Desai, et al.

Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it is absent from both the paired normal genome and public SNP databases. The current build of dbSNP, the most comprehensive public SNP database, however, inadequately represents several non-European Caucasian populations, posing a limitation in cancer genomic...

BioCreative V track 4: a shared task for the extraction of causal network information using the Biological Expression Language

Jan 2016 | Fabio Rinaldi, Tilia Renate Ellendorff, Sumit Madan, et al.

Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. Track 4 at BioCreative V attempts to approach this complexity using fragments of large-scale manually curated biological networks, represented in Biological Expression Language (BEL), as training and test data. BEL is an advanced...

The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins

Jan 2016 | Andrew D. Rouillard, Gregory W. Gundersen, Nicolas F. Fernandez, et al.

Genomics, epigenomics, transcriptomics, proteomics and metabolomics efforts rapidly generate a plethora of data on the activity and levels of biomolecules within mammalian cells. At the same time, curation projects that organize knowledge from the biomedical literature into online databases are expanding. Hence, there is a wealth of information about genes, proteins and their...

neXtA5: accelerating annotation of articles via automated approaches in neXtProt

Jan 2016 | Luc Mottin, Julien Gobeill, Emilie Pasche, et al.

The rapid increase in the number of published articles poses a challenge for curated databases to remain up-to-date. To help the scientific community and database curators deal with this issue, we have developed an application, neXtA5, which prioritizes the literature for specific curation requirements. Our system, neXtA5, is a curation service composed of three main elements...

Coreference resolution improves extraction of Biological Expression Language statements from texts

Jan 2016 | Miji Choi, Haibin Liu, William Baumgartner, et al.

We describe a system that automatically extracts biological events from biomedical journal articles, and translates those events into Biological Expression Language (BEL) statements. The system incorporates existing text mining components for coreference resolution, biological event extraction and a previously formally untested strategy for BEL statement generation. Although...

HPIDB 2.0: a curated database for host–pathogen interactions

Jan 2016 | Mais G. Ammari, Cathy R. Gresham, Fiona M. McCarthy, et al.

Identification and analysis of host–pathogen interactions (HPI) is essential to study infectious diseases. However, HPI data are sparse in existing molecular interaction databases, especially for agricultural host–pathogen systems. Therefore, resources that annotate, predict and display the HPI that underpin infectious diseases are critical for developing novel intervention...

SorghumFDB: sorghum functional genomics database with multidimensional network analysis

Jan 2016 | Tian Tian, Qi You, Liwei Zhang, et al.

Sorghum (Sorghum bicolor [L.] Moench) has excellent agronomic traits and biological properties, such as heat and drought-tolerance. It is a C4 grass and potential bioenergy-producing plant, which makes it an important crop worldwide. With the sorghum genome sequence released, it is essential to establish a sorghum functional genomics data mining platform. We collected genomic...

The Ensembl gene annotation system

Jan 2016 | Bronwen L. Aken, Sarah Ayling, Daniel Barrell, et al.

The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in...

A comprehensive view of the web-resources related to sericulture

Jan 2016 | Deepika Singh, Hasnahana Chetia, Debajyoti Kabiraj, et al.

Recent progress in the field of sequencing and analysis has led to a tremendous spike in data and the development of data science tools. One of the outcomes of this scientific progress is the development of numerous databases, which are gaining popularity in all disciplines of biology, including sericulture. As economically important organisms, silkworms are studied extensively for...

Combining machine learning, crowdsourcing and expert knowledge to detect chemical-induced diseases in text

Jan 2016 | Àlex Bravo, Tong Shu Li, Andrew I. Su, et al.

Drug toxicity is a major concern for both regulatory agencies and the pharmaceutical industry. In this context, text-mining methods for the identification of drug side effects from free text are key for the development of up-to-date knowledge sources on drug adverse reactions. We present a new system for identification of drug side effects from the literature that combines three...

Predicting structured metadata from unstructured metadata

Jan 2016 | Lisa Posch, Maryam Panahiazar, Michel Dumontier, et al.

Mining clinical attributes of genomic variants through assisted literature curation in Egas

Jan 2016 | Sérgio Matos, David Campos, Renato Pinho, et al.

The veritable deluge of biological data over recent years has led to the establishment of a considerable number of knowledge resources that compile curated information extracted from the literature and store it in structured form, facilitating its use and exploitation. In this article, we focus on the curation of inherited genetic variants and associated clinical attributes, such...

AuDis: an automatic CRF-enhanced disease normalization in biomedical text

Jan 2016 | Hsin-Chun Lee, Yi-Yu Hsu, Hung-Yu Kao

Diseases play central roles in many areas of biomedical research and healthcare. Consequently, aggregating the disease knowledge and treatment research reports becomes an extremely critical issue, especially in rapid-growth knowledge bases (e.g. PubMed). We therefore developed a system, AuDis, for disease mention recognition and normalization in biomedical texts. Our system...
