Search: authors:"Kei-Hoi Cheung"

26 papers found.
Predicting urinary tract infections in the emergency department with machine learning

Table. UTI diagnoses. ICD codes for UTI. (DOCX) Author Contributions Conceptualization: R. Andrew Taylor, Christopher L. Moore, Kei-Hoi Cheung, Cynthia Brandt. Formal analysis: R. Andrew Taylor. Methodology: R. Andrew Taylor, Kei-Hoi Cheung, Cynthia Brandt. Supervision: Christopher L. Moore, Cynthia Brandt. Visualization: R. Andrew Taylor. Writing – original draft: R. Andrew Taylor, Kei

A Statistical Framework to Predict Functional Non-Coding Regions in the Human Genome Through Integrated Analysis of Annotation Data

, USA: Yuwei Cheng, Kei-Hoi Cheung & Hongyu Zhao. Yale Center for Medical Informatics, Yale School of Medicine, New Haven, CT, USA: Kei-Hoi Cheung. Department of Emergency Medicine, Yale School of Medicine, New Haven ... Contributions: Q.L. and H.Z

A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations

Background It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to recent pharmacological successes in treating these driver mutations and their...

A spatial simulation approach to account for protein structure when identifying non-random somatic mutations

Background Current research suggests that a small set of “driver” mutations are responsible for tumorigenesis while a larger body of “passenger” mutations occur in the tumor but do not progress the disease. Due to recent pharmacological successes in treating cancers caused by driver mutations, a variety of methodologies that attempt to identify such mutations have been developed...

Utilizing protein structure to identify non-random somatic mutations

Background Human cancer is caused by the accumulation of somatic mutations in tumor suppressors and oncogenes within the genome. In the case of oncogenes, recent theory suggests that there are only a few key “driver” mutations responsible for tumorigenesis. As there have been significant pharmacological successes in developing drugs that treat cancers that carry these driver...

A semantic web framework to integrate cancer omics data with biological knowledge

and edited the paper. Kei-Hoi Cheung provided general design guidance, helped formulate the biomedical queries and edited the paper. Michael Krauthammer provided research and design guidance, helped

Using semantic web rules to reason on an ontology of pseudogenes

Motivation: Recent years have seen the development of a wide range of biomedical ontologies. Notable among these is Sequence Ontology (SO) which offers a rich hierarchy of terms and relationships that can be used to annotate genomic data. Well-designed formal ontologies allow data to be reasoned upon in a consistent and logically sound way and can lead to the discovery of new...

Bringing Web 2.0 to bioinformatics

Zhang Zhang, Kei-Hoi Cheung, Jeffrey P. Townsend. Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have

Web GIS in practice VI: a demo playlist of geo-mashups for public health neogeographers

'Mashup' was originally used to describe the mixing together of musical tracks to create a new piece of music. The term now refers to Web sites or services that weave data from different sources into a new data source or service. Using a musical metaphor that builds on the origin of the word 'mashup', this paper presents a demonstration "playlist" of four geo-mashup vignettes...

A journey to Semantic Web query federation in the life sciences

Background As interest in adopting the Semantic Web in the biomedical domain continues to grow, Semantic Web technology has been evolving and maturing. A variety of technological approaches including triplestore technologies, SPARQL endpoints, Linked Data, and Vocabulary of Interlinked Datasets have emerged in recent years. In addition to the data warehouse construction, these...

Handling multiple testing while interpreting microarrays with the Gene Ontology Database

Background The development of software tools that analyze microarray data in the context of genetic knowledgebases is being pursued by multiple research groups using different methods. A common problem for many of these tools is how to correct for multiple statistical testing since simple corrections are overly conservative and more sophisticated corrections are currently...

Pseudofam: the pseudogene families database

Pseudofam (http://pseudofam.pseudogene.org) is a database of pseudogene families based on the protein families from the Pfam database. It provides resources for analyzing the family structure of pseudogenes including query tools, statistical summaries and sequence alignments. The current version of Pseudofam contains more than 125 000 pseudogenes identified from 10 eukaryotic...

LinkHub: a Semantic Web system that facilitates cross-database queries and information retrieval in proteomics

Background A key abstraction in representing proteomics knowledge is the notion of unique identifiers for individual entities (e.g. proteins) and the massive graph of relationships among them. These relationships are sometimes simple (e.g. synonyms) but are often more complex (e.g. one-to-many relationships in protein family membership). Results We have built a software system...

Approaches to neuroscience data integration

Kei-Hoi Cheung, Ernest Lim, Matthias Samwald, Huajun Chen, Luis Marenco, Matthew E. Holford, Thomas M. Morse, Pradeep Mutalik, Gordon M. Shepherd, Perry L. Miller. As the number of neuroscience databases

YeastHub: a semantic web use case for integrating data in the life sciences domain

Motivation: As the semantic web technology is maturing and the need for life sciences data integration over the web is growing, it is important to explore how data integration needs can be addressed by the semantic web. The main problem that we face in data integration is a lack of widely-accepted standards for expressing the syntax and semantics of the data. We address this...

KARMA: a web server application for comparing and annotating heterogeneous microarray platforms

We have developed a universal web server application (KARMA) that allows comparison and annotation of user-defined pairs of microarray platforms based on diverse types of genome annotation data (across different species) collected from multiple sources. The application is an effective tool for diverse microarray platforms, including arrays that are provided by (i) the Keck...

PathMAPA: a tool for displaying gene expression and performing statistical tests on metabolic pathways at multiple levels for Arabidopsis

Background To date, many genomic and pathway-related tools and databases have been developed to analyze microarray data. In published web-based applications to date, however, complex pathways have been displayed with static image files that may not be up-to-date or are time-consuming to rebuild. In addition, gene expression analyses focus on individual probes and genes with...

The TRIPLES database: a community resource for yeast molecular biology

Anuj Kumar, Kei-Hoi Cheung, Nick Tosches, Peter Masiar, Yang Liu, Perry Miller, Michael Snyder. TRIPLES is a web-accessible database of TRansposon-Insertion Phenotypes, Localization and Expression in

A web services choreography scenario for interoperating bioinformatics applications

Background Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1) the platforms on which the applications run are heterogeneous, 2) their web interface is not machine...

AlzPharm: integration of neurodegeneration data using RDF

Background Neuroscientists often need to access a wide range of data sets distributed over the Internet. These data sets, however, are typically neither integrated nor interoperable, resulting in a barrier to answering complex neuroscience research questions. Domain ontologies can enable the querying of heterogeneous data sets, but they are not sufficient for neuroscience since the...