Genetic diversity and evolution of human metapneumovirus fusion protein over twenty years
Virology Journal
Chin-Fen Yang 3
Chiaoyin K Wang 3
Sharon J Tollefson 2
Rohith Piyaratna 2
Linda D Lintao 3
Marla Chu 3
Alexis Liem 3
Mary Mark 3
Richard R Spaete 3
James E Crowe Jr 0 1 2
John V Williams 0 1 2
0 Department of Microbiology and Immunology, Vanderbilt University School of Medicine , Nashville, TN , USA
1 Monroe Carell Jr Children's Hospital at Vanderbilt , Nashville, TN , USA
2 Department of Pediatrics, Vanderbilt University School of Medicine , Nashville, TN , USA
3 MedImmune Vaccines, Inc, Mountain View , CA , USA
Background: Human metapneumovirus (HMPV) is an important cause of acute respiratory illness in children. We examined the diversity and molecular evolution of HMPV using 85 full-length F (fusion) gene sequences collected over a 20-year period. Results: The F gene sequences fell into two major groups, each with two subgroups, which exhibited a mean of 96% identity by predicted amino acid sequences. Amino acid identity within and between subgroups was higher than nucleotide identity, suggesting structural or functional constraints on F protein diversity. There was minimal progressive drift over time, and the genetic lineages were stable over the 20-year period. Several canonical amino acid differences discriminated between major subgroups, and polymorphic variations tended to cluster in discrete regions. The estimated rate of mutation was 7.12 × 10⁻⁴ substitutions/site/year and the estimated time to most recent common HMPV ancestor was 97 years (95% likelihood range 66-194 years). Analysis suggested that HMPV diverged from avian metapneumovirus type C (AMPV-C) 269 years ago (95% likelihood range 106-382 years). Conclusion: HMPV F protein remains conserved over decades. HMPV appears to have diverged from AMPV-C fairly recently.
Background
Human metapneumovirus (HMPV) is a recently
described respiratory virus in the order Mononegavirales,
family Paramyxoviridae, subfamily Pneumovirinae, genus
Metapneumovirus [1]. HMPV is a leading cause of lower
respiratory infection (LRI) in infants and children
worldwide [2-13]. HMPV is also associated with severe disease
in immunocompromised hosts or persons with
underlying conditions [14-20]. Most reports of HMPV molecular
epidemiology have included only a few seasons, and the
genetic variability of HMPV over decades has not been
determined. Candidate vaccines for HMPV are under
development [21-25], and the fusion (F) protein is the
major antigenic determinant of protection [22,24,26-28].
Therefore, it is critical to understand the potential for
immune escape through virus evolution over time, and
the likelihood that immunity against a particular F
protein included in a vaccine candidate will be broadly
protective.
The virus most closely related genetically to HMPV is
avian metapneumovirus type C (AMPV-C) [1]. AMPV is
an emerging pathogen of poultry that was identified in
1979. Subtypes AMPV-A and AMPV-B circulate in Europe
and Africa, while AMPV-C was discovered in Minnesota
and has been detected in the US and Korea [29,30].
Productive experimental infection of poultry with HMPV has
not been successful, and serological studies have failed to
detect evidence of human infection by AMPV [1]. Recent
data suggest that F protein is responsible for this species
restriction [31]. Thus, HMPV infection of humans may
arise from a relatively recent trans-species transmission
from AMPV-C.
We analyzed full-length F gene sequences from 68 isolates
of HMPV collected over a 20-year period from otherwise
healthy children with respiratory disease and 17
published full-length F gene sequences from other regions of
the world. Our data show that HMPV F is highly conserved
across geographic regions and over several decades. Distinct
amino acid changes were present between different
genetic lineages, but these amino acids were conserved
within lineages. Variations that were present clustered in
discrete regions, suggesting antigenic sites possibly driven
by selective immune pressure. However, HMPV F gene
sequences did not display progressive drift over time,
unlike influenza viruses. The mutation rate of HMPV was
similar to that of other RNA viruses, and the time to most
recent common ancestor suggested recent divergence
from AMPV-C.
Results
Comparison of sequence identity between subgroups
Full-length F gene sequences were obtained for 68
Tennessee strains of HMPV and assigned to one of the four
proposed lineages (A1, A2, B1, or B2) based on phylogenetic
analysis, discussed further below [32]. Of the 68 strains
sequenced, 34 (50%) were of the B2 lineage, 18 (26%)
A2, 7 (10%) B1 and 9 (13%) A1 lineage. Sequences
obtained in this study were compared to 17 published
full-length HMPV F gene sequences. The overall mean
nucleotide identity between all 85 isolates was 89%, with
a minimum identity of 83.7% (Table 1). The identity
within major groups was higher, mean 96% (minimum
93.9%) between A1 and A2, and mean 97% (minimum
93.5%) between B1 and B2. The B2 lineage diverged more
from the A lineages than did the B1 lineage. B2 mean identity
with A1 and A2 was 86.7% and 89.7%, respectively, while
B1 identity with A1 and A2 was 91.3% and 94.7%,
respectively. Mean nucleotide identity was >97% within all
minor lineages, although the minimum identity for the
B2 isolates was the lowest at 93.5%, showing more
diversity within this lineage.
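The mean and minimum identities reported above are summaries over all pairwise comparisons of aligned sequences. The following is a minimal sketch of that kind of calculation, not the procedure used in this study: it assumes gap-free sequences already aligned to equal length, and the strain names and 10-nt sequences are hypothetical toy data, not HMPV sequences.

```python
from itertools import combinations
from statistics import mean


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identity_summary(seqs: dict[str, str]) -> tuple[float, float]:
    """Return (mean, minimum) percent identity over all pairs of sequences."""
    values = [
        percent_identity(seqs[p], seqs[q]) for p, q in combinations(seqs, 2)
    ]
    return mean(values), min(values)


# Hypothetical toy alignment (NOT HMPV data): three 10-nt sequences.
toy = {
    "strain_1": "ATGTCTTGGA",
    "strain_2": "ATGTCTTGGA",
    "strain_3": "ATGTCATGGC",
}

mean_id, min_id = identity_summary(toy)
print(f"mean identity {mean_id:.1f}%, minimum identity {min_id:.1f}%")
```

Restricting the dictionary to the members of one lineage gives a within-lineage summary; comparing members of two different lineages pair by pair gives the between-lineage figure.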
Amino acid identity was more conserved than nucleotide
identity between and within all groups, with overall
minimum identity of 93.7% and mean identity 96.3%. Amino
acid identity within major groups was 98.7% for A1 and
A2, and 99.3% for B1 and B2. The minimum amino acid
identity between all lineages was approximately 94%; the
greater divergence of the B2 lineage at the nucleotide level
was not represented in the amino acid sequence.
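Amino acid identity can exceed nucleotide identity because many nucleotide substitutions are synonymous and leave the encoded residue unchanged. The sketch below illustrates this with the standard genetic code; the three 12-nt coding sequences are hypothetical toy examples, not HMPV data, and the helper names are illustrative only.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    if len(nt) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical toy coding sequences (NOT HMPV data).
seq_a = "ATGAAACTGGGT"  # encodes M K L G
seq_b = "ATGAAGCTAGGC"  # three third-position changes, all synonymous
seq_c = "ATGAGACTGGGT"  # AAA -> AGA changes lysine (K) to arginine (R)

nt_id = percent_identity(seq_a, seq_b)
aa_id = percent_identity(translate(seq_a), translate(seq_b))
print(f"nucleotide identity {nt_id:.1f}%, amino acid identity {aa_id:.1f}%")
print(translate(seq_c))
```

In the toy pair `seq_a` and `seq_b`, one quarter of the nucleotides differ yet the translated sequences are identical. The third sequence shows a non-synonymous change that swaps one basic residue for another.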
Table 1: Comparison of nucleotide and amino acid identity of full-length human metapneumovirus F genes within or between subgroups.
Rows: Number of sequences; Minimum % nt identity; Minimum % aa identity.
nt = nucleotide; aa = amino acid.
Distinct and conserved amino acid changes between lineages
There were a number of amino acid residues distinct to
each group or subgroup (Table 2). The greatest number of
divergent and subgroup-specific residues was identified in
the F1 domain, between the two heptad repeat (HR)
regions. At several positions (82, 348, 450, 479 and 518),
all subgroups carried either arginine or lysine, so a basic
residue was maintained; of these, only position 82 has been
shown to be cleaved during infection [33,34]. Many subgroup-specific
residues were similar biochemically between groups.
Some variations, however, were unexpected, such as the
presence of a proline at position 404 only in B subgroup
viruses. Fourteen cysteine residues were conserved among
all isolates except one Japanese sequence (JPS03.178)
with a reported C292W variation [35]. Three potential
N-glycosylation sites were conserved in (...truncated)