Analysis of codon usage and nucleotide composition bias in polioviruses
Jie Zhang, Meng Wang, Wen-qian Liu, Jian-hua Zhou, Hao-tai Chen, Li-na Ma, Yao-zhong Ding, Yuan-xing Gu, Yong-sheng Liu
State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, 730046 Gansu, China
Background: Poliovirus, the causative agent of poliomyelitis, is a human enterovirus, a member of the family Picornaviridae, and among the most rapidly evolving viruses known. Analysis of codon usage can reveal much about the molecular evolution of viruses. However, little information about the synonymous codon usage pattern of poliovirus genomes has been acquired to date.
Methods: The relative synonymous codon usage (RSCU) values, effective number of codons (ENC) values, nucleotide contents and dinucleotide abundances were investigated, and a comparative analysis of codon usage patterns was performed for the open reading frames (ORFs) of 48 poliovirus isolates, including 31 of genotype 1, 13 of genotype 2 and 4 of genotype 3.
Results: The overall extent of codon usage bias in the poliovirus samples is low (mean ENC = 53.754 > 40). The general correlation between base composition and codon usage bias suggests that mutational pressure, rather than natural selection, is the main factor that determines the codon usage bias in these polioviruses. The RSCU data show significant variation in codon usage bias among the three genotypes. Geographic factors also have some effect on the codon usage pattern (observed in genotype 1). Neither gene length nor virus type (vaccine-derived polioviruses (VDPVs), wild viruses and live attenuated viruses) had a significant effect on the variation of synonymous codon usage in the virus genes. The relative abundance of the dinucleotide CpG in the ORFs of polioviruses is far below the expected value, especially in the VDPVs and attenuated viruses of genotype 1.
Conclusion: The information from this study may not only have theoretical value for understanding poliovirus evolution, especially for VDPVs of genotype 1, but may also have potential value for the development of poliovirus vaccines.
Background
When molecular sequence data began to accumulate
nearly 20 years ago, it was noted that synonymous
codons are not used equally in different genomes, or even
in different genes of the same genome [1-3]. Synonymous
codon usage bias is an important evolutionary phenomenon
that is well known to exist in a wide range of biological
systems, from prokaryotes to eukaryotes [4,5]. Codon usage
analysis has been applied to prokaryotes and eukaryotes
such as Escherichia coli, Bacillus subtilis, Saccharomyces
cerevisiae, Caenorhabditis elegans and human beings [6-8].
The observed patterns of synonymous codon usage vary among
genes within a genome and among genomes. Codon usage is
attributed to the equilibrium between natural selection
and mutation pressure [9,10]. Recent studies of viral
codon usage have shown that mutation bias may be a more
important factor than natural selection in determining
the codon usage bias of some viruses, such as
Picornaviridae, Pestivirus, plant viruses, and vertebrate DNA
viruses [9,11-13]. A recent report also showed that the
G+C compositional constraint is the main factor that
determines the codon usage bias in iridovirus genomes
[11,14]. Analysis of codon usage can therefore reveal
much about the molecular evolution of viruses and of their
individual genes.
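To make the notion of unequal synonymous codon usage concrete, the sketch below computes RSCU values for a single ORF using the usual definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It is a minimal illustration, not the pipeline used in this study; the function name `rscu` and the toy sequence are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(orf):
    """Return {codon: RSCU} for one in-frame ORF (hypothetical helper).

    RSCU = observed count * (number of synonymous codons) / (total count
    of the synonymous family). Met, Trp and stop codons are skipped, as
    are amino acids that do not occur in the sequence.
    """
    seq = orf.upper().replace("U", "T")  # accept RNA or DNA alphabets
    usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by encoded amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if len(family) == 1 or total == 0:
            continue
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


if __name__ == "__main__":
    # Toy alanine-only sequence: GCT x2, GCC x1, GCA x1, GCG x0.
    for codon, value in sorted(rscu("GCTGCTGCCGCA").items()):
        print(codon, round(value, 3))
```

An RSCU of 1.0 means a codon is used exactly as often as expected under equal usage; values above or below 1.0 indicate over- or under-representation within its synonymous family.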
Polioviruses belong to the family Picornaviridae and
are classified as human enterovirus C (HEV-C) species
in the genus Enterovirus according to the current
taxonomy [15,16]. Polioviruses can be divided into three
different genotypes: 1, 2 and 3. The genome of each
genotype is a single positive-stranded RNA of
approximately 6 kb, consisting of a single large
open reading frame (ORF) flanked by 5′ and 3′
untranslated regions [17].
As is well known, the Sabin oral poliovaccine (OPV) is
among the best known viral vaccines [18]. It has saved
the lives and health of innumerable people, in particular
children. However, poliovirus is genetically highly
variable. OPV viruses may evolve into circulating, highly
diverged vaccine-derived polioviruses (VDPVs) whose
properties are hardly distinguishable from those of wild
polioviruses [19]. To date, little information about the
synonymous codon usage pattern of poliovirus genomes has
been acquired. To our knowledge, this is the first report of the
codon usage analysis on polioviruses (including wild
strains, attenuated live vaccine strains and VDPV
strains). In this study, we analyzed the codon usage data
and base composition of 48 available representative
complete ORFs of poliovirus to obtain some clues to the
features of genetic evolution of the virus.
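As a rough illustration of the base-composition quantities examined here, the sketch below computes the G+C content of an ORF and the relative abundance of a dinucleotide such as CpG. The odds-ratio form used (observed dinucleotide frequency divided by the product of the two mononucleotide frequencies) is the commonly used definition and is an assumption about the exact formula applied in this study; the function names and the toy sequence are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (hypothetical helper)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def dinucleotide_relative_abundance(seq, dinuc="CG"):
    """Observed/expected ratio for one dinucleotide (assumed odds-ratio form).

    rho_xy = f_xy / (f_x * f_y), where f_xy is the frequency of the
    dinucleotide over all overlapping positions and f_x, f_y are the
    mononucleotide frequencies. Values well below 1 indicate
    under-representation.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    dinuc = dinuc.upper().replace("U", "T")
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")

    f_x = seq.count(dinuc[0]) / n
    f_y = seq.count(dinuc[1]) / n
    if f_x == 0 or f_y == 0:
        return 0.0  # the dinucleotide cannot occur at all

    # Count overlapping occurrences; str.count would miss overlaps.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    f_xy = observed / (n - 1)
    return f_xy / (f_x * f_y)


if __name__ == "__main__":
    toy = "ATGGCACGTTTACCGGAT"  # toy sequence, not a poliovirus ORF
    print("GC content:", gc_content(toy))
    print("CpG relative abundance:", dinucleotide_relative_abundance(toy))
```

Applied to each ORF, a CpG ratio far below 1 would correspond to the under-representation described in the abstract.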
Methods
Sequence data
A total of 48 poliovirus genomes were used in this study
(Table 1). The serial number (SN), genotype, length
value, isolated region, GenBank accession numbers, and
(...truncated)