Abstract

Here, we present a diverse, structurally nonredundant data set of two‐chain protein–protein interfaces derived from the PDB. Using a sequence order‐independent structural comparison algorithm and hierarchical clustering, 3799 interface clusters are obtained. These yield 103 clusters with at least five nonhomologous members. We divide the clusters into three types. In Type I clusters, the global structures of the chains from which the interfaces are derived are also similar. This cluster type is expected because, in general, related proteins associate in similar ways. In Type II, the interfaces are similar; however, remarkably, the overall structures and functions of the chains are different. The functional spectrum is broad, from enzymes/inhibitors to immunoglobulins and toxins. The fact that structurally different monomers associate in similar ways suggests “good” binding architectures. This observation extends a paradigm in protein science: It is well known that proteins with similar structures may have different functions. Here, we show that it extends to interfaces. In Type III clusters, only one side of the interface is similar across the cluster. This structurally nonredundant data set provides rich data for studies of protein–protein interactions and recognition, cellular networks, and drug design. In particular, it may be useful in addressing the difficult question of what the favorable ways for proteins to interact are. (The data set is available at http://protein3d.ncifcrf.gov/~keskino/ and http://home.ku.edu.tr/~okeskin/INTERFACE/INTERFACES.html.)
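The clustering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a precomputed pairwise dissimilarity matrix between interfaces and uses plain average-linkage agglomerative clustering with a distance cutoff. The function names, the toy matrix, the cutoff of 0.5, and the choice of average linkage are all hypothetical and are not taken from the paper, which does not specify these details here. The size filter mirrors the "at least five members" criterion only by count; it does not check homology.

```python
from itertools import combinations


def average_linkage_clusters(dist, threshold):
    """Merge the closest pair of clusters (average linkage) until no
    pair is closer than `threshold`. `dist` is a symmetric matrix."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Mean pairwise dissimilarity between members of a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


def large_clusters(clusters, min_members=5):
    """Keep clusters with at least `min_members` members (count only)."""
    return [c for c in clusters if len(c) >= min_members]


# Toy data (hypothetical): interfaces 0-4 resemble each other, as do 5-6.
groups = [0, 0, 0, 0, 0, 1, 1]
dist = [
    [0.0 if i == j else (0.1 if groups[i] == groups[j] else 0.9)
     for j in range(7)]
    for i in range(7)
]

clusters = average_linkage_clusters(dist, threshold=0.5)
print(large_clusters(clusters))  # [[0, 1, 2, 3, 4]]
```

In this toy case the two-member cluster is dropped by the size filter, in the same spirit as the reduction from all interface clusters to those with at least five members.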

Keywords

Protein Data Bank (RCSB PDB); Cluster analysis; Interface (matter); Computational biology; Computer science; Set (abstract data type); Protein–protein interaction; Protein structure; Cluster (spacecraft); Hierarchical clustering; Sequence (biology); Data mining; Theoretical computer science; Biology; Artificial intelligence; Genetics; Biochemistry

Publication Info

Year: 2004
Type: article
Volume: 13
Issue: 4
Pages: 1043-1055
Citations: 201
Access: Closed

Cite This

Özlem Keskin, Chung‐Jung Tsai, Haim J. Wolfson et al. (2004). A new, structurally nonredundant, diverse data set of protein–protein interfaces and its implications. Protein Science, 13(4), 1043-1055. https://doi.org/10.1110/ps.03484604

Identifiers

DOI: 10.1110/ps.03484604