Investigating Different Duplication Pattern of Essential Genes in Mouse and Human
March
Debarun Acharya 0
Dola Mukherjee 0
Soumita Podder 0
Tapash C. Ghosh 0
0 Bioinformatics Centre, Bose Institute , Kolkata, West Bengal , India
Gene duplication is one of the major driving forces shaping genome and organism evolution and is thought to be itself regulated by some intrinsic properties of the gene. Comparing the essential genes of mouse and human, we observed that essential genes avoid duplication in mouse but tend to remain duplicated in human. In this study, we explored the reasons behind this difference through a cross-species comparison of human and mouse. We found that essential genes that are duplicated in human are functionally more redundant than those in mouse. The proportion of pseudogenized paralogs of essential genes is higher in mouse than in human. The duplicates of essential genes are under more stringent dosage regulation in human than in mouse. We also observed a slower evolutionary rate in the paralogs of human essential genes than in their mouse counterparts. Together, these results indicate that human essential genes are retained as duplicates to serve as backup copies that may shield them from harmful mutations.
Data Availability Statement: All the data used in the experiments are freely available in the
paper, the supplemental files, and in well-recognized public repositories. All "gene essentiality,
gene duplication, developmental genes and phyletic age data of mouse and human" are available
from the Online Gene Essentiality Database (URL- http://ogeedb.embl.de). The dataset is also
provided as a supplemental file S1_Dataset.xlsx. The duplicate pairs of mouse and human genes
under study are provided in supplemental file S2_Dataset.xlsx. All Gene Ontology Annotation for
mouse and human are available from the Ensembl biomart interface (Release 71)
(URL- http://www.ensembl.org/biomart/martview). Gene biotype data for Pseudogenization for mouse
and human are available from the Ensembl biomart interface (Release 71)
(URL- http://www.ensembl.org/biomart/martview). Nonsynonymous nucleotide substitution per
nonsynonymous sites (dN) and synonymous nucleotide substitution per synonymous sites (dS) for
mouse and human with corresponding one-to-one rat orthologs are available from the Ensembl
biomart interface (Release 71) (URL- http://www.ensembl.org/biomart/martview). All micro-RNA
target sites for mouse and human were obtained from TargetScan Release 6.2
(http://www.targetscan.org). In the case of any query, the readers may contact Mr. Debarun
Acharya (e-mail: ).
Competing Interests: The authors have declared that no competing interests exist.
Gene duplication is thought to be one of the major driving factors stimulating genome and
organism evolution [1–4], as it provides raw genetic material for structural and functional
modification and at the same time conserves the parental function. Although gene duplication
is not always beneficial, and most duplicates are subsequently inactivated or pseudogenized
in the genome [4], it may have many implications in an organism's life. For example, the
duplicates may be maintained in the genome for their immediate benefit to the organism, such as
increased gene dosage [5], or serve as backup copies to restore the function if the original one
becomes deleted [6,7]. Apart from this, the duplicates may undergo modifications to take up
novel functions, i.e. neofunctionalization [4], or they may share their function after
complementary degenerative mutations, i.e. subfunctionalization [8,9]. The pattern of gene
duplication may vary between species and also across different groups of genes within the same
species. Several factors contributing to gene duplication have been observed to date in diverse
organisms, such as protein connectivity and protein interaction network [10–12], protein
complexity [13,14], gene retention and sequence divergence [15], dosage balance [16] and, not
least, gene essentiality [17–19].
Essential genes are indispensable to an organism; their deletion causes a severe reduction in
fitness, such as sterility or lethality [20]. These genes are mainly associated with important
biological functions. However, many expressed genes performing such functions are
considered nonessential, as their deletion can be compensated by other genes having similar or
identical functions and expression [21]. Gene duplication is an important mechanism by which such
functional redundancy arises [4]. Essential genes may therefore either prefer or avoid gene
duplication. On the one hand, essential genes may need to be duplicated to provide backup copies
that shield them from harmful mutations. On the other hand, from an evolutionary standpoint,
essential genes may avoid duplication, since gene duplication driven by ectopic recombination
and replication may increase the mutational load, which is poorly tolerated by essential genes
as the most conserved gene group [22,23].
Gene essentiality has been widely studied across model organisms and shown to bear a complex
relationship with gene duplication [19]. In lower eukaryotes like yeast, a higher proportion of
essential genes was observed in singletons than in duplicates [7]. However, studies with
mouse showed that the proportion of essential genes in duplicates is comparable to that in
singletons [10,18]. In contrast, two follow-up studies with mouse report that the
proportion of essential genes is higher in singletons than in duplicates [21,24].
To date, all studies of essential genes have been carried out in yeast and mouse because
human gene essentiality data were unavailable. In a previous study, researchers attempted to
explore the properties of human orthologs of mouse essential genes [25]. However, considering
such human orthologs as essential may not be accurate [26]. Taking advantage of the Online
Gene Essentiality (OGEE) database, which represents a valuable resource of human and mouse
essential genes, we performed a comprehensive analysis comparing the duplication pattern of
essential genes in human and mouse. We noticed that in mouse the essential genes prefer to
remain as singletons, whereas the trend is reversed in human, an observation that has not been
explored so far. We have also explored the underlying reasons for, and the benefits of,
maintaining essential genes as duplicates in humans.
Materials and Methods
Gene Essentiality and Gene Duplication
Gene essentiality and duplication data for human (Homo sapiens) and mouse (Mus musculus) were
obtained from the Online Gene Essentiality (OGEE) database (http://ogeedb.embl.de) [27] (S1
Dataset). The paralog lists for human and mouse essential genes were provided by the authors
of the OGEE database [27] (S2 Dataset).
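As an illustration of how essentiality and duplication annotations of this kind can be combined, the following minimal Python sketch tabulates genes by essential and duplicate status and computes the fraction of essential genes among duplicates and among singletons. It is a sketch under stated assumptions, not the authors' pipeline: the record layout (gene identifier, essential flag, duplicate flag), the names TOY_GENES, contingency and essential_fraction, and the toy values are all hypothetical and are not taken from OGEE or from S1 Dataset.

```python
from collections import Counter

# Hypothetical toy records: (gene_id, is_essential, is_duplicate).
# Illustrative placeholders only; NOT values from OGEE or S1 Dataset.
TOY_GENES = [
    ("g1", True, True),
    ("g2", True, False),
    ("g3", False, True),
    ("g4", False, True),
    ("g5", True, True),
    ("g6", False, False),
    ("g7", False, False),
]


def contingency(records):
    """Count genes in each cell of the 2x2 essential-by-duplicate table."""
    counts = Counter((essential, duplicate) for _, essential, duplicate in records)
    return {
        "essential_duplicate": counts[(True, True)],
        "essential_singleton": counts[(True, False)],
        "nonessential_duplicate": counts[(False, True)],
        "nonessential_singleton": counts[(False, False)],
    }


def essential_fraction(records, duplicated):
    """Fraction of essential genes among duplicates (duplicated=True)
    or among singletons (duplicated=False)."""
    group = [essential for _, essential, duplicate in records
             if duplicate == duplicated]
    if not group:
        return float("nan")  # no genes in this group
    return sum(group) / len(group)


if __name__ == "__main__":
    print(contingency(TOY_GENES))
    print("essential fraction in duplicates:", essential_fraction(TOY_GENES, True))
    print("essential fraction in singletons:", essential_fraction(TOY_GENES, False))
```

With real annotations, the same two fractions would be computed separately for each species before being compared. A statistical test on the 2x2 table would be a natural next step, but the choice of test is left open here.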
The developmental genes for mouse and human were obtained from Online Gene Essentialit (...truncated)