Abstract
Phylogenetic trees from multiple genes can be obtained in two fundamentally different ways. In one, gene sequences are concatenated into a super‐gene alignment, which is then analyzed to generate the species tree. In the other, phylogenies are inferred separately from each gene, and a consensus of these gene phylogenies is used to represent the species tree. Here, we have compared these two approaches by means of computer simulation, using 448 parameter sets, including evolutionary rate, sequence length, base composition, and transition/transversion rate bias. In these simulations, we emphasized a worst‐case scenario analysis in which 100 replicate datasets for each evolutionary parameter set (gene) were generated, and the replicate dataset that produced a tree topology showing the largest number of phylogenetic errors was selected to represent that parameter set. Both randomly selected and worst‐case replicates were utilized to compare the consensus and concatenation approaches primarily using the neighbor‐joining (NJ) method. We find that the concatenation approach yields more accurate trees, even when the sequences concatenated have evolved with very different substitution patterns and no attempts are made to accommodate these differences while inferring phylogenies. These results appear to hold true for parsimony and likelihood methods as well. The concatenation approach shows >95% accuracy with only 10 genes. However, this gain in accuracy is sometimes accompanied by reinforcement of certain systematic biases, resulting in spuriously high bootstrap support for incorrect partitions, whether we employ site, gene, or a combined bootstrap resampling approach. Therefore, it will be prudent to report the number of individual genes supporting an inferred clade in the concatenated sequence tree, in addition to the bootstrap support. J. Exp. Zool. (Mol. Dev. Evol.) 304B:000–000, 2005. © 2005 Wiley‐Liss, Inc.
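The data handling behind the concatenation approach, the three bootstrap resampling schemes, and the recommended per-gene support count can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the dictionary-based alignment format, and the reading of "combined" resampling as "resample genes, then sites within each drawn gene" are all assumptions, and tree inference itself (NJ, parsimony, or likelihood) is left out.

```python
import random

# Assumed data format: each gene is a dict mapping taxon name -> aligned sequence,
# and every gene contains the same set of taxa.

def concatenate(genes):
    """Join per-gene alignments into one super-gene alignment."""
    taxa = sorted(genes[0])
    return {t: "".join(g[t] for g in genes) for t in taxa}

def resample_sites(alignment, rng):
    """Site bootstrap: draw alignment columns with replacement."""
    taxa = list(alignment)
    length = len(alignment[taxa[0]])
    cols = [rng.randrange(length) for _ in range(length)]
    # The same sampled columns are used for every taxon, so sites stay aligned.
    return {t: "".join(alignment[t][c] for c in cols) for t in taxa}

def resample_genes(genes, rng):
    """Gene bootstrap: draw whole genes with replacement."""
    return [rng.choice(genes) for _ in genes]

def combined_replicate(genes, rng):
    """Combined bootstrap (assumed form): resample genes, then sites within each."""
    drawn = resample_genes(genes, rng)
    return concatenate([resample_sites(g, rng) for g in drawn])

def genes_supporting(clade, gene_tree_clades):
    """Count the gene trees whose set of clades contains the given clade.

    gene_tree_clades holds one set of frozensets of taxon names per gene tree.
    """
    clade = frozenset(clade)
    return sum(clade in clades for clades in gene_tree_clades)

# Toy input: two short hypothetical genes for three taxa.
genes = [
    {"A": "ACGT", "B": "ACGA", "C": "TCGA"},
    {"A": "GG", "B": "GA", "C": "AA"},
]
supergene = concatenate(genes)
replicate = combined_replicate(genes, random.Random(0))
```

Each replicate alignment would then be passed to a tree-building method of choice, and `genes_supporting` gives the per-gene count that can be reported next to the bootstrap value of a clade in the concatenated tree.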
Publication Info
- Year: 2004
- Type: article
- Volume: 304B
- Issue: 1
- Pages: 64-74
- Citations: 457
- Access: Closed
Identifiers
- DOI: 10.1002/jez.b.21026