Abstract
The abundance of different SSU rRNA ("16S") gene sequences in environmental samples is widely used in studies of microbial ecology as a measure of microbial community structure and diversity. However, the genomic copy number of the 16S gene varies greatly, from one in many species to as many as 15 in some bacteria and to hundreds in some microbial eukaryotes. As a result of this variation, the relative abundance of 16S genes in environmental samples can be attributed both to variation in the relative abundance of different organisms and to variation in genomic 16S copy number among those organisms. Despite this, many studies assume that the abundance of 16S gene sequences is a surrogate measure of the relative abundance of the organisms containing those sequences. Here we present a method that uses data on sequences and genomic copy number of 16S genes, along with phylogenetic placement and ancestral state estimation, to estimate organismal abundances from environmental DNA sequence data. We use theory and simulations to demonstrate that 16S genomic copy number can be accurately estimated from the short reads typically obtained from high-throughput environmental sequencing of the 16S gene, and that organismal abundances in microbial communities are more strongly correlated with estimated abundances obtained from our method than with gene abundances. We re-analyze several published empirical data sets and demonstrate that using gene abundance rather than estimated organismal abundance can lead to different inferences about community diversity and structure and about the identity of the dominant taxa in microbial communities. Our approach will allow microbial ecologists to make more accurate inferences about microbial diversity and abundance based on 16S sequence data.
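The core arithmetic behind the correction described in the abstract can be illustrated with a minimal sketch. It assumes that a per-taxon 16S copy number has already been estimated by some other means; the phylogenetic placement and ancestral state estimation steps are not shown. The function name `estimate_organism_abundance`, the taxon labels, and all counts and copy numbers are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def estimate_organism_abundance(gene_counts, copy_numbers):
    """Convert 16S gene counts into relative organismal abundances.

    gene_counts:  dict mapping taxon -> number of 16S reads observed
    copy_numbers: dict mapping taxon -> estimated genomic 16S copy number
                  (assumed to be supplied by an upstream estimation step)
    """
    corrected = {}
    for taxon, count in gene_counts.items():
        copies = copy_numbers[taxon]
        if copies <= 0:
            raise ValueError(f"copy number for {taxon} must be positive")
        # Each genome contributes `copies` gene sequences, so dividing
        # the gene count by the copy number approximates a genome count.
        corrected[taxon] = count / copies

    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    # Renormalize so the estimated organismal abundances sum to 1.
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical example: equal gene counts, very different copy numbers.
gene_counts = {"taxon_A": 100, "taxon_B": 100}
copy_numbers = {"taxon_A": 1, "taxon_B": 10}
abundances = estimate_organism_abundance(gene_counts, copy_numbers)
```

In this made-up example the two taxa have identical gene abundances, yet the taxon with one copy per genome accounts for about ten times as many estimated organisms as the taxon with ten copies, which is the kind of discrepancy the abstract describes.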
Publication Info
- Year: 2012
- Type: article
- Volume: 8
- Issue: 10
- Pages: e1002743-e1002743
- Citations: 503
- Access: Closed
Identifiers
- DOI: 10.1371/journal.pcbi.1002743
- PMID: 23133348
- PMCID: PMC3486904