Surprising results on phylogenetic tree building methods based on molecular sequences
Open access
Author
Date
2012-06
Type
- Journal Article
ETH Bibliography
yes
Abstract
Background
We analyze phylogenetic tree building methods from molecular sequences (PTMS). These are methods that base their construction solely on sequences, either coding DNA or amino acids.
Results
Our first result is a statistically significant evaluation of 176 PTMSs, done by comparing trees derived from 193138 orthologous groups of proteins using a new measure of quality between trees. This new measure, called the Intra measure, is very consistent across different groups of species and strong in the sense that it separates the methods with high confidence.
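The abstract does not define the Intra measure, so the following is only a minimal sketch of what "comparing trees" can mean in practice. It computes the Robinson-Foulds distance, a standard topological distance that is swapped in here for illustration and is not claimed to be the measure used in the paper. The nested-tuple tree representation and the names `leaves`, `splits` and `rf_distance` are assumptions made for this example.

```python
from typing import FrozenSet, Set, Tuple, Union

# Assumed representation: a leaf is a string, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial bipartitions of the tree, viewed as unrooted.

    Each split is stored as the side that does not contain a fixed
    reference leaf, so both orientations of a split compare equal.
    """
    all_leaves = leaves(tree)
    ref = min(all_leaves)
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> None:
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = all_leaves - clade if ref in clade else clade
        # Skip trivial splits (empty side, single leaf, or all but one leaf).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def rf_distance(a: Tree, b: Tree) -> int:
    """Robinson-Foulds distance: number of splits present in exactly one tree."""
    if leaves(a) != leaves(b):
        raise ValueError("trees must be defined over the same leaf set")
    return len(splits(a) ^ splits(b))
```

A distance of 0 means the two trees have the same unrooted topology; larger values mean more disagreeing internal branches. Any quality measure built on such distances would need further aggregation over many orthologous groups, which is not sketched here.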
The second result is a comparison of these trees against trees derived from accepted taxonomies, which we call the Taxon measure. We consider the NCBI taxonomic classification and its derived topologies, which are also available in electronic form, to be the most accepted biological consensus on phylogenies. The correlation between the two measures is remarkably high, which supports both measures simultaneously.
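As an illustration of how agreement between two per-method scores could be checked, the sketch below computes a Pearson correlation coefficient. The abstract does not state which correlation statistic was used, so the choice of Pearson is an assumption, and the function name `pearson` and its inputs (one score per method under each measure) are hypothetical.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists.

    xs and ys are assumed to hold one score per tree building method,
    e.g. one list per quality measure, in the same method order.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A value close to 1 would indicate that the two measures rank the methods in nearly the same way; a rank-based statistic such as Spearman's would be an equally reasonable choice for this kind of check.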
Conclusions
The big surprise of the evaluation is that the maximum likelihood methods do not score well; minimal evolution distance methods over MSA-induced alignments score consistently better. This comparison also allows us to rank the different components of the tree building methods, such as MSAs, substitution matrices, ML tree builders and distance methods. It is also clear that there is a difference between Metazoa and the rest, which points to evolution leaving different molecular traces. We also think that these measures of tree quality will motivate the design of new PTMSs, as it is now easier to evaluate them with certainty.
Permanent link
https://doi.org/10.3929/ethz-b-000055626
Publication status
published
Journal / series
BMC Bioinformatics
Volume
Pages / Article No.
Publisher
BioMed Central
Subject
Phylogenetic trees; Tree building methods; Maximum likelihood; Distance measures; Multiple sequence alignments; Substitution matrices; Molecular sequences
Organisational unit
03309 - Gonnet, Gaston