Abstract

Motivation: Identification of protein–ligand binding sites is critical to protein function annotation and drug discovery. However, no single method generates optimal binding site predictions for all protein types. Combining complementary predictions is probably the most reliable solution to the problem.

Results: We develop two new methods, one based on binding-specific substructure comparison (TM-SITE) and another on sequence profile alignment (S-SITE), for complementary binding site predictions. The methods are tested on a set of 500 non-redundant proteins harboring 814 natural, drug-like and metal ion molecules. Starting from low-resolution protein structure predictions, the methods successfully recognize >51% of binding residues with an average Matthews correlation coefficient (MCC) significantly higher (P-value <10⁻⁹ in Student's t-test) than other state-of-the-art methods, including COFACTOR, FINDSITE and ConCavity. When TM-SITE and S-SITE are combined with other structure-based programs, a consensus approach (COACH) can increase MCC by 15% over the best individual predictions. COACH was examined in the recent community-wide CAMEO experiment and consistently ranked as the best method in the last 22 individual datasets, with an Area Under the Curve score 22.5% higher than the second-best method. These data demonstrate a new robust approach to protein–ligand binding site recognition, which is ready for genome-wide structure-based function annotations.

Availability: http://zhanglab.ccmb.med.umich.edu/COACH/

Contact: zhng@umich.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
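The abstract reports accuracy as a Matthews correlation coefficient over binding residues. As a minimal sketch of the standard MCC definition applied to residue-level predictions (not the authors' evaluation code), the function below treats the predicted and observed binding sites as sets of residue indices. The function name, argument names and the example residue numbers are assumptions made for illustration.

```python
import math


def binding_site_mcc(predicted, actual, n_residues):
    """Matthews correlation coefficient for a residue-level prediction.

    predicted, actual: sets of residue indices labelled as binding.
    n_residues: total number of residues in the protein (hypothetical input).
    """
    tp = len(predicted & actual)          # binding residues correctly predicted
    fp = len(predicted - actual)          # predicted binding, actually non-binding
    fn = len(actual - predicted)          # binding residues that were missed
    tn = n_residues - tp - fp - fn        # non-binding residues correctly left out

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is reported as 0 when it is undefined.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Illustrative residue indices only; these are not data from the paper.
example = binding_site_mcc({10, 11, 12, 40}, {10, 11, 12, 13}, n_residues=100)
```

MCC ranges from -1 to 1 and, unlike raw accuracy, is not inflated by the large majority of non-binding residues in a typical protein, which is presumably why it is a natural summary metric for this task.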

Keywords

Annotation; Computational biology; Computer science; Binding site; Substructure; Sequence (biology); Drug discovery; Identification (biology); Protein methods; Function (biology); Protein function; Data mining; Bioinformatics; Artificial intelligence; Sequence analysis; Biology; Genetics; DNA; Engineering

MeSH Terms

Algorithms; Binding Sites; Databases, Protein; Humans; Ligands; Models, Molecular; Protein Structure, Tertiary; Proteins; Sequence Alignment; Sequence Analysis, Protein

Publication Info

Year: 2013
Type: article
Volume: 29
Issue: 20
Pages: 2588-2595
Citations: 1014
Access: Closed

Citation Metrics

OpenAlex: 1014
Influential: 45
CrossRef: 860

Cite This

Jianyi Yang, Ambrish Roy, Yang Zhang (2013). Protein–ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment. Bioinformatics, 29(20), 2588-2595. https://doi.org/10.1093/bioinformatics/btt447

Identifiers

DOI
10.1093/bioinformatics/btt447
PMID
23975762
PMCID
PMC3789548
