Abstract

Protein structures come in families. Are families “closely knit” or “loosely knit” entities? We describe a measure of relatedness among polymer conformations. Based on weighted distance maps, this measure differs from existing measures mainly in two respects: (1) it is computationally fast, and (2) it can compare any two proteins, regardless of their relative chain lengths or degree of similarity. It does not require finding relative alignments. The measure is used here to determine the dissimilarities between all 12,403 possible pairs of 158 diverse protein structures from the Brookhaven Protein Data Bank (PDB). Combined with minimal spanning trees and hierarchical clustering methods, this measure is used to define structural families. It is also useful for rapidly searching a dataset of protein structures for specific substructural motifs. By using an analogy to distributions of Euclidean distances, we find that protein families are not tightly knit entities.
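
The pipeline outlined in the abstract (pairwise dissimilarities, then a minimal spanning tree, then families) can be illustrated with a small sketch. The abstract does not give the formula for the weighted distance-map measure, so the dissimilarity values below are made-up placeholders for four hypothetical structures `A` to `D`. The function names, the use of Prim's algorithm, and the cutoff rule are assumptions for illustration, not the authors' method.

```python
from math import comb

# Number of unordered pairs among 158 structures, as quoted in the abstract.
n_pairs = comb(158, 2)  # 158 * 157 / 2 = 12403

# Hypothetical structures and placeholder dissimilarities (NOT from the paper).
names = ["A", "B", "C", "D"]
dissim = {
    frozenset(("A", "B")): 0.2, frozenset(("A", "C")): 0.9,
    frozenset(("A", "D")): 0.8, frozenset(("B", "C")): 0.7,
    frozenset(("B", "D")): 0.85, frozenset(("C", "D")): 0.1,
}

def minimal_spanning_tree(names, dissim):
    """Prim's algorithm over a complete graph of pairwise dissimilarities."""
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge joining a structure in the tree to one outside it.
        d, a, b = min(
            (dissim[frozenset((a, b))], a, b)
            for a in in_tree for b in names if b not in in_tree
        )
        edges.append((a, b, d))
        in_tree.add(b)
    return edges

def families(names, edges, cutoff):
    """Group structures linked by tree edges no longer than `cutoff`
    (equivalent to cutting a single-linkage hierarchy at that level)."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b, d in edges:
        if d <= cutoff:
            parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())

tree = minimal_spanning_tree(names, dissim)
groups = families(names, tree, cutoff=0.5)
```

With these placeholder values, removing the one tree edge longer than the cutoff leaves two groups, `A` with `B` and `C` with `D`. How tightly knit such groups are depends on the choice of cutoff, which the sketch leaves arbitrary.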

Keywords

Protein Data Bank, Measure (data warehouse), Globular protein, Protein Data Bank (RCSB PDB), Similarity measure, Cluster analysis, Similarity (geometry), Hierarchical clustering, Protein structure, Protein family, Euclidean distance, Computer science, Mathematics, Biology, Artificial intelligence, Data mining, Genetics, Gene

Publication Info

Year
1993
Type
article
Volume
2
Issue
6
Pages
884-899
Citations
70
Access
Closed

Cite This

David P. Yee, Ken A. Dill (1993). Families and the structural relatedness among globular proteins. Protein Science, 2(6), 884-899. https://doi.org/10.1002/pro.5560020603

Identifiers

DOI
10.1002/pro.5560020603