Abstract

The correct interpretation of any phylogenetic tree depends on that tree being correctly rooted. We present STRIDE, a fast, effective, and outgroup-free method for identification of gene duplication events and species tree root inference in large-scale molecular phylogenetic analyses. STRIDE identifies sets of well-supported in-group gene duplication events from a set of unrooted gene trees, and analyses these events to infer a probability distribution over an unrooted species tree for the location of its root. We show that STRIDE correctly identifies the root of the species tree in multiple large-scale molecular phylogenetic data sets spanning a wide range of timescales and taxonomic groups. We demonstrate that the novel probability model implemented in STRIDE can accurately represent the ambiguity in species tree root assignment for data sets where information is limited. Furthermore, applying STRIDE to outgroup-free inference of the origin of the eukaryotic tree yielded a root probability distribution that provides additional support for leading hypotheses for the origin of the eukaryotes.

Keywords

Biology, Phylogenetic tree, STRIDE, Gene duplication, Tree (set theory), Phylogenetics, Outgroup, Evolutionary biology, Inference, Phylogenetic network, Root (linguistics), Phylogenomics, Range (aeronautics), Gene, Computational biology, Genetics, Combinatorics, Artificial intelligence, Mathematics, Paleontology, Clade, Computer science

Publication Info

Year: 2017
Type: article
Volume: 34
Issue: 12
Pages: 3267-3278
Citations: 309
Access: Closed

Citation Metrics

309 (OpenAlex)

Cite This

David Emms, Steven Kelly (2017). STRIDE: Species Tree Root Inference from Gene Duplication Events. Molecular Biology and Evolution, 34(12), 3267-3278. https://doi.org/10.1093/molbev/msx259

Identifiers

DOI: 10.1093/molbev/msx259