Phylemon: a suite of web tools for molecular evolution, phylogenetics and phylogenomics
Joaquín Tárraga [0,1], Ignacio Medina [1], Leonardo Arbiza [1], Jaime Huerta-Cepas [0,1], Toni Gabaldón [1], Joaquín Dopazo [0,1], Hernán Dopazo [1]
[0] Functional Genomics Node, INB, CIPF, Valencia 46013, Spain
[1] Bioinformatics Department, Centro de Investigación Príncipe Felipe (CIPF)
Phylemon is an online platform for phylogenetic and evolutionary analyses of molecular sequence data. It has been developed as a web server that integrates a suite of tools selected from among the most popular stand-alone programs in phylogenetic and evolutionary analysis. It was conceived as a response to the growing demand for data analysis from experimental scientists who wish to add molecular evolution and phylogenetic insight to their research. The tools included in Phylemon cover a wide yet selected range of programs: from basic multiple sequence alignment to elaborate statistical methods of phylogenetic reconstruction, including methods for the analysis of evolutionary rates and molecular adaptation. Phylemon has several features that differentiate it from other resources: (i) it offers an integrated environment that enables the direct concatenation of evolutionary analyses, stores results and handles the required data format conversions; (ii) once an output file is produced, Phylemon suggests the next possible analyses, thus guiding the user and facilitating the integration of multi-step analyses; and (iii) users can define and save complete pipelines for specific phylogenetic analyses, to be run automatically on many genes in subsequent sessions or on multiple genes in a single session (phylogenomics). The Phylemon web server is available at http://phylemon.bioinfo.cipf.es.
Phylogenetic and evolutionary analyses of sequences are
among the most often used methodologies in laboratories
working in functional, comparative and structural
genomics (1). Since 1980, when the first version of the
PHYLogeny Inference Package (PHYLIP) (2) was
introduced by Felsenstein, a large number of programs for
phylogenetic inference have been developed. Currently,
PHYLIP (2), PAUP (3), MEGA (1), PhyML (4), PAML (5)
and MrBayes (6) are well-known programs used
by thousands of researchers around the world. Other more
specific programs, designed to test evolutionary
hypotheses for model selection, tree topology, molecular clock or
adaptation, are less popular among common users, but
they are, nevertheless, of great interest for users familiar
with evolutionary enquiries. At present, the most
comprehensive list of phylogenetic resources is maintained at the
University of Washington in Seattle
(http://evolution.genetics.washington.edu/phylip/software.html), which
listed 292 phylogeny packages and 38 web servers
as of 2003.
Web servers for phylogenetic and evolutionary analyses
provide a direct means for addressing several evolutionary
questions, ranging from the computation of a multiple
alignment and a neighbor-joining tree using the ClustalW
program (7) (http://www.ebi.ac.uk/clustalw/), to the more
sophisticated analysis of molecular adaptation for the
detection of positively selected sites in DNA sequences (using
methods such as those available in the HYPHY package (8)
(http://www.datamonkey.org/)). Many such servers run a
single tool or program, whereas others bring together
many of the most popular programs of phylogenetic
reconstruction (e.g. see
http://bioweb.pasteur.fr/seqanal/phylogeny/intro-uk.html).
Despite this diversity there is, so far, no single
integrated web server that provides a common framework
to run the most frequent analyses on DNA and protein
sequences from a phylogenetic and evolutionary
perspective. Non-expert users are therefore often overwhelmed by the
variety of servers, formats and options available and by
the difficulty of concatenating analyses performed on
different servers. The main objective of Phylemon is to
meet this need by bringing all the necessary applications
together in a single integrated
web framework that guides users through the whole
evolutionary analysis.
OUTLINE OF THE PROGRAM
Phylemon is a web server that integrates a selected suite of
more than 20 different tools from the most popular
standalone programs of phylogenetic and evolutionary analysis
(Figure 1A).
Three features characterize all tools integrated in
Phylemon: (1) each tool comes with examples that
familiarize users with the correct input data and expected
results, (2) input formats (preferentially FASTA or
PHYLIP) are converted automatically so that data can
move between tools and (3) all input and
output files can be saved in default or user-defined
projects (folders).
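The kind of format conversion mentioned in point (2) can be illustrated with a minimal Python sketch that turns an aligned FASTA file into sequential PHYLIP text. This is not Phylemon's own converter, whose implementation is not described here; the function names `parse_fasta` and `to_phylip` and the sample sequences are hypothetical, and the sketch assumes classic PHYLIP names of exactly 10 characters.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first whitespace-delimited token of the header;
            # fall back to a generated name if the header is empty.
            tokens = line[1:].split()
            name = tokens[0] if tokens else f"seq{len(records) + 1}"
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_phylip(records):
    """Format aligned (name, sequence) pairs as sequential PHYLIP text."""
    if not records:
        raise ValueError("no sequences to convert")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    lines = [f" {len(records)} {lengths.pop()}"]
    for name, seq in records:
        # Assumption: classic PHYLIP names are truncated or padded to 10 characters.
        lines.append(f"{name[:10]:<10}{seq}")
    return "\n".join(lines) + "\n"


# Made-up example alignment with two sequences of six sites each.
fasta = ">human\nACGT-A\n>chimp_with_long_name\nACGTTA\n"
phylip = to_phylip(parse_fasta(fasta))
print(phylip)
```

Truncating names to 10 characters can produce duplicate labels when long names share a prefix, so a production converter would also need to check that the shortened names remain unique.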
Phylemon can be accessed by anonymous login or by
registered users. The only difference between these choices
is that registered users, from whom only an e-mail address is
required, can store project results and use them at a later
time for further analysis (Figure 1G).
PHYLOGENETIC PROGRAMS
Phylemon runs distance-based methods, maximum
parsimony analyses and statistical methods of phylogenetic
reconstruction. Distance and parsimony methods for
DNA and protein sequence data are provided by the most
often used programs of the PHYLIP package (2) v3.65:
DnaDist and ProtDist for distances, and DnaPars and ProtPars for parsimony.
Pairwise distance matrices can be represented as a
phylogenetic tree using the neighbor-joining (NJ)
algorithm (Neighbor) or by applying a least-squares (LS) method
or a minimum evolution (ME) criterion (Fitch program).
To obtain trees with statistical support on internal
nodes, a re-sampling method (i.e. the bootstrap option
included in the Seqboot program) and the corresponding
tree-summarizing algorithm (i.e. the majority-rule consensus tree from the
Consense program) of PHYLIP are included.
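The re-sampling step described above can be sketched in a few lines of Python. The example below draws bootstrap replicates by sampling alignment columns with replacement and computes a pairwise distance matrix for each replicate. It is an illustration only: the function names and the toy alignment are hypothetical, and the uncorrected p-distance used here is a deliberate simplification of the model-corrected distances that DnaDist computes.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        differing += a != b
    return differing / compared if compared else 0.0


def distance_matrix(alignment):
    """Return all pairwise p-distances as a dict keyed by (name, name)."""
    names = sorted(alignment)
    return {(a, b): p_distance(alignment[a], alignment[b]) for a in names for b in names}


# Made-up toy alignment: three sequences, ten sites.
alignment = {"A": "ACGTACGTAC", "B": "ACGTACGTTC", "C": "ACGAACCTTC"}
rng = random.Random(42)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
matrices = [distance_matrix(rep) for rep in replicates]
print(distance_matrix(alignment)[("A", "C")])
```

In a full analysis each replicate matrix would be passed to a tree-building method such as NJ, and the resulting trees summarized in a majority-rule consensus; those steps are omitted here for brevity.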
Basic maximum likelihood (ML) analyses of DNA and
protein sequence data are provided by the DnaML and
ProML programs of the PHYLIP package in Phylemon.
When a more sophisticated ML analysis is required, users
can run PhyML version aLRT (9,10) or TREE-PUZZLE
v5.2 (11). The major differences between these ML programs
are: (1) PhyML is faster than any other ML algorithm of
phylogenetic reconstruction, (2) TREE-PUZZLE uses a
quartet-puzzling method instead of the more classical
heuristic tree searches, (3) TREE-PUZZLE
reports reliability values while PhyML reports
Felsenstein's bootstrap values and aLRT-related
branch support statistics (9), (4) TREE-PUZZLE can quantify
the amount of phylogenetic signal contained in a
data set (the probability of the data producing a tree-like
phylogenetic representation) through the
likelihood-mapping method (12) and (5) TREE-PUZZLE computes
ML pairwise distances that can easily be represented in an
NJ/LS/ME tree.
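The idea behind likelihood mapping in point (4) can be sketched as follows. For each quartet of sequences, the likelihoods of the three possible unrooted topologies are converted into weights that sum to one, and the resulting point is assigned to the nearest of seven attractors: three tree-like corners, three partly resolved edge midpoints and one star-like centre. The Python code below is a simplified reading of that scheme and not TREE-PUZZLE's implementation; the attractor labels, function names and log-likelihood values are all hypothetical.

```python
import math

# Assumed attractor positions in the weight simplex: corners are fully
# resolved quartets, edge midpoints are partly resolved, the centre is star-like.
ATTRACTORS = {
    "T1": (1.0, 0.0, 0.0),
    "T2": (0.0, 1.0, 0.0),
    "T3": (0.0, 0.0, 1.0),
    "T12": (0.5, 0.5, 0.0),
    "T13": (0.5, 0.0, 0.5),
    "T23": (0.0, 0.5, 0.5),
    "star": (1 / 3, 1 / 3, 1 / 3),
}


def posterior_weights(log_likelihoods):
    """Convert three quartet log-likelihoods into weights that sum to 1."""
    if len(log_likelihoods) != 3:
        raise ValueError("a quartet has exactly three topologies")
    top = max(log_likelihoods)
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    scaled = [math.exp(value - top) for value in log_likelihoods]
    total = sum(scaled)
    return tuple(s / total for s in scaled)


def classify(weights):
    """Return the label of the attractor closest to the weight vector."""
    def squared_distance(label):
        return sum((w - a) ** 2 for w, a in zip(weights, ATTRACTORS[label]))
    return min(ATTRACTORS, key=squared_distance)


def resolved_fraction(quartet_log_likelihoods):
    """Fraction of quartets that fall in one of the three tree-like regions."""
    labels = [classify(posterior_weights(q)) for q in quartet_log_likelihoods]
    return sum(label in ("T1", "T2", "T3") for label in labels) / len(labels)


# Made-up log-likelihoods for four quartets.
quartets = [
    (-1000.0, -1010.0, -1012.0),  # one topology clearly preferred
    (-1000.0, -1000.0, -1000.0),  # no preference at all
    (-1000.0, -1000.1, -1020.0),  # two topologies nearly tied
    (-1015.0, -1000.0, -1011.0),  # second topology clearly preferred
]
print(resolved_fraction(quartets))
```

A high fraction of quartets in the tree-like regions suggests that the data set carries enough signal to support a resolved tree, whereas many star-like quartets point to weak phylogenetic signal.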
Finally, Phylemon runs Bayesian phylogenetic analysis
using MrBayes v3.1.2 (6). MrBayes runs in Phylemon
with the same characteristics that users have in Windows
or Linux interfaces. Users can define all the parameters of
MrBayes in a fil (...truncated)