Abstract

As models of sequence evolution become increasingly complex, many criteria for model selection have been proposed, and tools are available to select the best model for an alignment under a particular criterion. However, in many instances the selected model fails to explain the data adequately, as reflected by large deviations between the observed pattern frequencies and their expectations under the model. We present MISFITS (http://www.cibiv.at/software/misfits), an approach to evaluate the goodness of fit between a phylogenetic model and an alignment. MISFITS introduces a minimum number of "extra substitutions" on the inferred tree to provide a biologically motivated explanation of why the alignment may deviate from expectation. These extra substitutions, together with the evolutionary model, then fully explain the alignment. We illustrate the method on several examples and then survey the goodness of fit of the selected models to the alignments in the PANDIT database.
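
To make "deviations between observed pattern frequencies and the corresponding expectation" concrete, the following minimal Python sketch counts the site patterns (columns) of a toy alignment and compares them with expected counts. It is not the MISFITS algorithm and does not place extra substitutions on a tree. The function names, the toy alignment, and the `expected` probabilities are hypothetical; in practice the expected pattern probabilities would come from a fitted model and tree.

```python
from collections import Counter


def pattern_counts(alignment):
    """Count site patterns (alignment columns) in equal-length sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same length")
    # Each column, read top to bottom across the sequences, is one pattern.
    return Counter("".join(column) for column in zip(*alignment))


def pattern_deviations(alignment, expected_probs):
    """Return observed minus expected count for every pattern.

    expected_probs maps a pattern to its probability under some fitted
    model and tree. It is a hypothetical input here, not computed.
    """
    n_sites = len(alignment[0])
    observed = pattern_counts(alignment)
    patterns = set(observed) | set(expected_probs)
    return {
        p: observed.get(p, 0) - n_sites * expected_probs.get(p, 0.0)
        for p in patterns
    }


# Toy data (assumed for illustration): three sequences, four sites.
alignment = ["AAGT", "AAGC", "AAAT"]
expected = {"AAA": 0.5, "GGA": 0.25, "TCT": 0.125, "TTT": 0.125}
deviations = pattern_deviations(alignment, expected)

for pattern, deviation in sorted(deviations.items()):
    print(pattern, deviation)
```

Patterns with a large positive or negative deviation are the ones the model explains poorly. According to the abstract, MISFITS accounts for such deviations with a minimum number of extra substitutions on the inferred tree.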

Keywords

Goodness of fit, Model selection, Phylogenetic tree, Selection (genetic algorithm), Biology, Multiple sequence alignment, Sequence (biology), Data mining, Computer science, Artificial intelligence, Machine learning, Sequence alignment

Publication Info

Year: 2010
Type: article
Volume: 28
Issue: 1
Pages: 143-152
Citations: 17
Access: Closed

Cite This

Minh Anh Nguyen, Steffen Klaere, Arndt von Haeseler (2010). MISFITS: Evaluating the Goodness of Fit between a Phylogenetic Model and an Alignment. Molecular Biology and Evolution, 28(1), 143-152. https://doi.org/10.1093/molbev/msq180

Identifiers

DOI: 10.1093/molbev/msq180