OGEE v2: an update of the online gene essentiality database with special focus on differentially essential genes in human cancer cell lines
D940–D944 Nucleic Acids Research, 2017, Vol. 45, Database issue
doi: 10.1093/nar/gkw1013
Published online 30 October 2016
Wei-Hua Chen1,*,†, Guanting Lu2,†, Xiao Chen3, Xing-Ming Zhao3 and Peer Bork4,5,6,7
Received August 23, 2016; Revised October 14, 2016; Editorial Decision October 15, 2016; Accepted October 18, 2016
ABSTRACT
OGEE is an Online GEne Essentiality database. To
enhance our understanding of the essentiality of
genes, in OGEE we collected experimentally tested
essential and non-essential genes, as well as associated gene properties known to contribute to
gene essentiality. We focus on large-scale experiments, and complement our data with text-mining
results. We organized tested genes into data sets
according to their sources, and tagged those with
variable essentiality statuses across data sets as
conditionally essential genes, intending to highlight
the complex interplay between gene functions and
environments/experimental perturbations. Developments since the last public release include increased
numbers of species and gene essentiality data sets,
inclusion of non-coding essential sequences and
genes with intermediate essentiality statuses. In addition, we included 16 essentiality data sets from
cancer cell lines, corresponding to 9 human cancers; with OGEE, users can easily explore the shared
and differentially essential genes within and between
cancer types. These genes, especially those derived
from cell lines that are similar to tumor samples,
could reveal the oncogenic drivers, paralogous gene
expression pattern and chromosomal structure of
the corresponding cancer types, and can be further screened to identify targets for cancer therapy
and/or new drug development. OGEE is freely available at http://ogee.medgenius.info.
INTRODUCTION
Essential genes are the genes of an organism that are critical for its survival. They are of particular importance because of their theoretical and practical applications,
such as studying the robustness of a biological system (1),
defining a minimal genome/organism (2,3) and identifying
effective therapeutic targets in pathogens (4–6) and human
cancers (7–11). In recent years, the technologies used for
gene essentiality studies have been evolving rapidly, ranging from low-throughput single-gene knockout experiments
(12,13) to high-throughput mutagenesis (3), RNAi (7,8) and,
more recently, CRISPR-based genome editing methods (14–18). Recent studies showed that CRISPR technology outperformed other methods (14,19), featuring low noise and
minimal off-target effects (19).
Being essential is not an intrinsic property of a gene;
rather, it is highly dependent on a variety of factors including the function and expression pattern of the gene, the genetic background of the host, the environment and other
settings. For example, genes coding for proteins involved
in the biosynthesis of amino acids, nucleic acids and vitamins are essential for cell survival in minimal media, but
not in rich media where the corresponding metabolites are
supplied (20). In addition, different experimental methods
may generate different results. For example, CRISPR-based
methods could identify more essential genes than siRNA-based methods (21), while cell lines generate a lower proportion of essential genes than in vivo experiments when the same multi-cellular
organism is used (22).

* To whom correspondence should be addressed. Tel: +86 2787542127; Fax: +86 2787542527; Email:
† These authors contributed equally to the paper as first authors.
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which
permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact

1 Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and
Molecular-imaging, Department of Bioinformatics and Systems Biology, College of Life Science and Technology,
Huazhong University of Science and Technology (HUST), 430074 Wuhan, Hubei, China, 2 Department of Blood
Transfusion, Tangdu Hospital, the Fourth Military Medical University, No 1, Xinsi Road, Chanba District, 710000 Xi’an,
China, 3 Department of Computer Science and Technology, Tongji University, Shanghai 201804, China, 4 European
Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany, 5 Molecular Medicine
Partnership Unit, University of Heidelberg and European Molecular Biology Laboratory, 69120 Heidelberg, Germany,
6 Max-Delbrück-Centre for Molecular Medicine, Robert-Rössle-Straße 10, 13125 Berlin, Germany and 7 Department
of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany

[Figure 1 image not reproduced. Panel A: PE% (proportion of essential genes) for developmental versus non-developmental genes; values appearing in the panel, in extraction order: 79.04, 40.86. Panel B: PE% for duplicates versus singlets; values appearing in the panel, in extraction order: 44.07, 53.49.]
Figure 1. Screenshots taken from the ‘Analyze’ page. With integrated tools, users can easily analyze the collected data and visualize the results. Shown here
are the proportions of essential genes (PE%) as a function of involvement in development (developmental versus non-developmental genes, panel A) and
duplication status (duplicates versus singlets, panel B) in mouse.
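The proportion of essential genes (PE%) reported in Figure 1 is, for each gene group, the number of essential genes divided by the number of tested genes in that group. A minimal sketch of that arithmetic follows; the function name, record layout and toy records are hypothetical illustrations, not OGEE's implementation or data.

```python
from collections import defaultdict


def proportion_essential(records):
    """Return PE% per group from (gene, group, is_essential) records.

    The record layout is an assumption made for this sketch.
    """
    tested = defaultdict(int)
    essential = defaultdict(int)
    for _gene, group, is_essential in records:
        tested[group] += 1
        if is_essential:
            essential[group] += 1
    # Percentage of tested genes in each group that are essential.
    return {group: 100.0 * essential[group] / tested[group] for group in tested}


# Hypothetical toy records, not taken from OGEE.
records = [
    ("g1", "duplicate", True),
    ("g2", "duplicate", False),
    ("g3", "duplicate", False),
    ("g4", "duplicate", False),
    ("g5", "singlet", True),
    ("g6", "singlet", False),
]
pe = proportion_essential(records)
print(pe)
```

Only genes that were actually tested enter the denominator, so untested genes do not lower a group's PE%.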
Genes with variable essentiality statuses under different
circumstances are referred to as ‘conditionally essential genes
(CEGs)’ or ‘differentially essential genes (DEGs)’ (14,22).
Conditional essentiality is a biologically meaningful and important concept; for example, genes that are essential in a cancer cell line but
are non-essential in human tissues can reveal the oncogenic
drivers, paralogous gene expression pattern and chromosomal structure of the corresponding cancer type (14).
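One way to picture shared versus differentially essential genes is as set operations over the essential-gene lists of several contexts (cell lines or tissues). The sketch below is an illustration only: the function name, context labels and gene identifiers are hypothetical, and it assumes every gene was tested in every context, which real screens do not guarantee.

```python
def shared_and_differential(essential_by_context):
    """Split essential genes into shared and differentially essential ones.

    essential_by_context maps a context label to the set of genes found
    essential there (a hypothetical layout chosen for this sketch).
    """
    gene_sets = list(essential_by_context.values())
    shared = set.intersection(*gene_sets)   # essential in every context
    union = set.union(*gene_sets)           # essential in at least one context
    # For each differentially essential gene, list the contexts that need it.
    differential = {
        gene: sorted(ctx for ctx, genes in essential_by_context.items() if gene in genes)
        for gene in union - shared
    }
    return shared, differential


# Hypothetical toy input, not taken from OGEE.
essential_by_context = {
    "cell_line_A": {"G1", "G2", "G3"},
    "cell_line_B": {"G1", "G3"},
    "normal_tissue": {"G1"},
}
shared, differential = shared_and_differential(essential_by_context)
print(shared, differential)
```

In this toy input, G2 and G3 are essential in at least one cell line but not in the normal-tissue set, which is the kind of gene the text singles out as informative about a cancer type.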
In 2012, we introduced OGEE v1 (22) to promote the concept of ‘conditional essentiality’, which had not been widely
adopted by existing essential gene databases at the time, and
to advance our understanding of gene essentiality. We did
so by including not only essential and non-essential genes,
but also associated gene properties that are known to affect
gene essentiality; we provided tools that allow users to compare gene essentiality among different gene groups, or to compare properties of essential genes with those of non-essential genes.
In addition, we organized experimentally tested genes into
data sets according to their sources and tagged those with
variable essentiality statuses across data sets as CEGs.
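The tagging rule described above can be sketched as grouping observations by gene and checking whether more than one status was recorded. This is a minimal illustration under stated assumptions: the status labels, data set names and function name are hypothetical and are not OGEE's schema.

```python
from collections import defaultdict


def tag_genes(observations):
    """Tag each gene from (gene, dataset, status) observations.

    A gene whose status differs between data sets is tagged "conditional";
    otherwise it keeps its single observed status. Labels are assumptions.
    """
    statuses = defaultdict(set)
    for gene, _dataset, status in observations:
        statuses[gene].add(status)
    tags = {}
    for gene, seen in statuses.items():
        if len(seen) > 1:
            tags[gene] = "conditional"
        else:
            tags[gene] = next(iter(seen))
    return tags


# Hypothetical toy observations, not taken from OGEE.
observations = [
    ("geneA", "dataset1", "essential"),
    ("geneA", "dataset2", "non-essential"),
    ("geneB", "dataset1", "essential"),
    ("geneB", "dataset2", "essential"),
    ("geneC", "dataset1", "non-essential"),
]
tags = tag_genes(observations)
print(tags)
```

A gene tested in only one data set can never be tagged conditional under this rule, so the tag reflects both biology and how often the gene was tested.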
In this study we introduce an updated version of OGEE.
In this new version we added new species and new data sets;
we added (...truncated)