Abstract

Recent advances in DNA sequencing technology have allowed the collection of high-dimensional data from human-associated microbial communities on an unprecedented scale. A major goal of these studies is the identification of important groups of microorganisms that vary according to physiological or disease states in the host, but the incidence of rare taxa and the large numbers of taxa observed make that goal difficult to achieve using traditional approaches. Fortunately, similar problems have been addressed by the machine learning community in other fields of study such as microarray analysis and text classification. In this review, we demonstrate that several existing supervised classifiers can be applied effectively to microbiota classification, both for selecting subsets of taxa that are highly discriminative of the type of community, and for building models that can accurately classify unlabeled data. To encourage the development of new approaches to supervised classification of microbiota, we discuss several structures inherent in microbial community data that may be available for exploitation in novel approaches, and we include as supplemental information several benchmark classification tasks for use by the community.

Keywords

Identification (biology), Machine learning, Discriminative model, Artificial intelligence, Biology, Benchmark (surveying), Human microbiome, Microbiome, Cluster analysis, Computer science, Computational biology, Data science, Bioinformatics, Ecology

Publication Info

Year: 2010
Type: review
Volume: 35
Issue: 2
Pages: 343-359
Citations: 450
Access: Closed

Cite This

Dan Knights, Elizabeth K. Costello, Rob Knight (2010). Supervised classification of human microbiota. FEMS Microbiology Reviews, 35(2), 343-359. https://doi.org/10.1111/j.1574-6976.2010.00251.x

Identifiers

DOI: 10.1111/j.1574-6976.2010.00251.x