Abstract

In order to extract the maximum amount of information from the rapidly accumulating genome sequences, all conserved genes need to be classified according to their homologous relationships. Comparison of proteins encoded in seven complete genomes from five major phylogenetic lineages and elucidation of consistent patterns of sequence similarities allowed the delineation of 720 clusters of orthologous groups (COGs). Each COG consists of individual orthologous proteins or orthologous sets of paralogs from at least three lineages. Orthologs typically have the same function, allowing transfer of functional information from one member to an entire COG. This relation automatically yields a number of functional predictions for poorly characterized genomes. The COGs comprise a framework for functional and evolutionary genome analysis.
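The two rules stated in the abstract (a COG must span at least three lineages, and a known function can be transferred to the other members of the same COG) can be illustrated with a minimal sketch. It does not reproduce the authors' clustering procedure: the clusters are taken as given, and every protein ID, lineage label, function string, and helper name below is hypothetical.

```python
# Toy illustration only; all identifiers and data are made up for this sketch.
# protein ID -> (lineage label, known function or None if uncharacterized)
PROTEINS = {
    "a1": ("lineage_A", "hypothetical kinase"),
    "b1": ("lineage_B", None),
    "c1": ("lineage_C", None),
    "a2": ("lineage_A", None),
    "b2": ("lineage_B", None),
}

# Candidate clusters, assumed to come from an upstream similarity analysis.
CLUSTERS = {
    "cluster_1": ["a1", "b1", "c1"],
    "cluster_2": ["a2", "b2"],
}


def lineage_count(members, proteins):
    """Number of distinct lineages represented among the cluster members."""
    return len({proteins[m][0] for m in members})


def keep_cogs(clusters, proteins, min_lineages=3):
    """Keep only clusters that span at least `min_lineages` lineages."""
    return {
        name: members
        for name, members in clusters.items()
        if lineage_count(members, proteins) >= min_lineages
    }


def predict_functions(members, proteins):
    """Transfer a function to unannotated members of one cluster.

    A prediction is made only when the annotated members agree on a
    single function; otherwise nothing is transferred.
    """
    known = {proteins[m][1] for m in members if proteins[m][1] is not None}
    if len(known) != 1:
        return {}
    function = next(iter(known))
    return {m: function for m in members if proteins[m][1] is None}


cogs = keep_cogs(CLUSTERS, PROTEINS)
predictions = {name: predict_functions(m, PROTEINS) for name, m in cogs.items()}
```

In this toy input only `cluster_1` meets the three-lineage threshold, so its two uncharacterized members inherit the single known annotation, while `cluster_2` is dropped. Refusing to transfer when annotations disagree is a conservative choice made for the sketch, not something the abstract specifies.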

Keywords

Genome; Biology; Phylogenetic tree; Genetics; Computational biology; Gene; Phylogenetics; Function (biology); Conserved sequence; Evolutionary biology; Base sequence

MeSH Terms

Amino Acid Sequence; Archaeal Proteins; Bacteria; Bacterial Proteins; Conserved Sequence; Evolution, Molecular; Fungal Proteins; Genes, Archaeal; Genes, Bacterial; Genes, Fungal; Methanococcus; Multigene Family; Phylogeny; Proteins; Saccharomyces cerevisiae; Species Specificity

Publication Info

Year: 1997
Type: review
Volume: 278
Issue: 5338
Pages: 631-637
Citations: 3583
Access: Closed

Citation Metrics

OpenAlex: 3583
Influential: 289
CrossRef: 2900

Cite This

Roman L. Tatusov, Eugene V. Koonin, David J. Lipman (1997). A Genomic Perspective on Protein Families. Science, 278(5338), 631-637. https://doi.org/10.1126/science.278.5338.631

Identifiers

DOI
10.1126/science.278.5338.631
PMID
9381173
