Abstract
We revisit statistical tests for branches of evolutionary trees reconstructed upon molecular data. A new, fast, approximate likelihood-ratio test (aLRT) for branches is presented here as a competitive alternative to nonparametric bootstrap and Bayesian estimation of branch support. The aLRT is based on the idea of the conventional LRT, with the null hypothesis corresponding to the assumption that the inferred branch has length 0. We show that the LRT statistic is asymptotically distributed as a maximum of three random variables drawn from the chi(0)2 + chi(1)2 distribution. The new aLRT of interior branch uses this distribution for significance testing, but the test statistic is approximated in a slightly conservative but practical way as 2(l1- l2), i.e., double the difference between the maximum log-likelihood values corresponding to the best tree and the second best topological arrangement around the branch of interest. Such a test is fast because the log-likelihood value l2 is computed by optimizing only over the branch of interest and the four adjacent branches, whereas other parameters are fixed at their optimal values corresponding to the best ML tree. The performance of the new test was studied on simulated 4-, 12-, and 100-taxon data sets with sequences of different lengths. The aLRT is shown to be accurate, powerful, and robust to certain violations of model assumptions. The aLRT is implemented within the algorithm used by the recent fast maximum likelihood tree estimation program PHYML (Guindon and Gascuel, 2003).
Keywords
Affiliated Institutions
Related Publications
New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0
PhyML is a phylogeny software based on the maximum-likelihood principle. Early PhyML versions used a fast algorithm performing nearest neighbor interchanges to improve a reasona...
A Modified Likelihood Ratio Test for Homogeneity in Finite Mixture Models
Summary Testing for homogeneity in finite mixture models has been investigated by many researchers. The asymptotic null distribution of the likelihood ratio test (LRT) is very c...
jModelTest: Phylogenetic Model Averaging
jModelTest is a new program for the statistical selection of models of nucleotide substitution based on "Phyml" (Guindon and Gascuel 2003. A simple, fast, and accurate algorithm...
Trust Region Newton Method for Logistic Regression
Large-scale logistic regression arises in many applications such as document classification and natural language processing. In this paper, we apply a trust region Newton method...
On Bootstrapping the Likelihood Ratio Test Stastistic for the Number of Components in a Normal Mixture
An important but difficult problem in practice is assessing the number of components g in a mixture. An obvious way of proceeding is to use the likelihood ratio test statistic λ...
Publication Info
- Year
- 2006
- Type
- article
- Volume
- 55
- Issue
- 4
- Pages
- 539-552
- Citations
- 2638
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1080/10635150600755453