Abstract
Pyrosequencing of PCR-amplified fragments that target variable regions within the 16S rRNA gene has quickly become a powerful method for analyzing the membership and structure of microbial communities. This approach has raised questions that were not fully appreciated by those using traditional Sanger sequencing-based methods. These include the effects of alignment quality, the best method of calculating pairwise genetic distances for 16S rRNA genes, whether it is appropriate to filter variable regions, and how the choice of variable region relates to the genetic diversity observed in full-length sequences. I used a diverse collection of 13,501 high-quality full-length sequences to assess each of these questions. First, alignment quality had a significant impact on distance values and downstream analyses. Specifically, the greengenes alignment, which aligns variable regions poorly, predicted higher genetic diversity, richness, and phylogenetic diversity than the SILVA and RDP-based alignments. Second, the effect of different gap treatments on pairwise genetic distances depended strongly on the variation in sequence length within a region; however, the choice of calculation method had only a subtle effect on the richness or phylogenetic diversity estimated for a sample in that region. Third, applying a sequence mask to remove variable positions had a profound impact on genetic distances, muting the observed richness and phylogenetic diversity. Finally, the genetic distances calculated for each of the variable regions correlated poorly with those calculated for the full-length gene. Thus, while it is tempting to apply traditional cutoff levels derived for full-length sequences to these shorter sequences, it is not advisable. Analysis of beta-diversity metrics showed that each of these factors can have a significant impact on the comparison of community membership and structure. Taken together, these results urge caution in the design and interpretation of analyses using pyrosequencing data.
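To make the point about gap treatments concrete, the following is a minimal Python sketch of how the same pair of aligned sequences can yield different pairwise distances depending on how gaps are counted. The abstract does not name the gap treatments that were compared, so the three modes here ("ignore", "each", "one"), the function name pairwise_distance, and the example sequences are all assumptions chosen for illustration, not the paper's method.

```python
def pairwise_distance(a: str, b: str, gap_mode: str = "one") -> float:
    """Uncorrected distance between two aligned sequences.

    gap_mode (hypothetical names, chosen for this sketch):
      "ignore" - columns with a gap in either sequence are skipped
      "each"   - every gapped column counts as one mismatch
      "one"    - a run of consecutive gaps in one sequence counts as one mismatch
    """
    if gap_mode not in ("ignore", "each", "one"):
        raise ValueError(f"unknown gap_mode: {gap_mode}")
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")

    mismatches = 0
    compared = 0
    gap_owner = None  # which sequence ("a" or "b") owns the current gap run

    for x, y in zip(a, b):
        x_gap, y_gap = x == "-", y == "-"
        if x_gap and y_gap:
            continue  # column is a gap in both sequences: no information
        if x_gap or y_gap:
            owner = "a" if x_gap else "b"
            if gap_mode == "each":
                mismatches += 1
                compared += 1
            elif gap_mode == "one" and owner != gap_owner:
                # first column of a new gap run: count the whole run once
                mismatches += 1
                compared += 1
            gap_owner = owner
            continue
        gap_owner = None  # a base-to-base column ends any gap run
        compared += 1
        if x != y:
            mismatches += 1

    return mismatches / compared if compared else 0.0


if __name__ == "__main__":
    seq_a = "ACGT--ACGT"  # made-up aligned sequences, not real 16S data
    seq_b = "ACGTTTACGA"
    for mode in ("ignore", "each", "one"):
        print(mode, round(pairwise_distance(seq_a, seq_b, mode), 4))
```

In this sketch the three modes disagree only where gaps occur, which is consistent with the abstract's observation that the effect of gap treatment grows with the variation in sequence length within a region.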
Publication Info
- Year: 2010
- Type: article
- Volume: 6
- Issue: 7
- Pages: e1000844-e1000844
- Citations: 394
- Access: Closed
Identifiers
- DOI: 10.1371/journal.pcbi.1000844