Abstract

Motivation: Protein homology detection and sequence alignment are at the basis of protein structure prediction, function prediction and evolution.

Results: We have generalized the alignment of protein sequences with a profile hidden Markov model (HMM) to the case of pairwise alignment of profile HMMs. We present a method for detecting distant homologous relationships between proteins based on this approach. The method (HHsearch) is benchmarked together with BLAST, PSI-BLAST, HMMER and the profile–profile comparison tools PROF_SIM and COMPASS, in an all-against-all comparison of a database of 3691 protein domains from SCOP 1.63 with pairwise sequence identities below 20%.

Sensitivity: When the predicted secondary structure is included in the HMMs, HHsearch is able to detect between 2.7 and 4.2 times more homologs than PSI-BLAST or HMMER and between 1.44 and 1.9 times more than COMPASS or PROF_SIM for a rate of false positives of 10%. Approximately half of the improvement over the profile–profile comparison methods is attributable to the use of profile HMMs in place of simple profiles.

Alignment quality: Higher sensitivity is mirrored by an increased alignment quality. HHsearch produced 1.2, 1.7 and 3.3 times more good alignments (‘balanced’ score >0.3) than the next best method (COMPASS), and 1.6, 2.9 and 9.4 times more than PSI-BLAST, at the family, superfamily and fold level, respectively.

Speed: HHsearch scans a query of 200 residues against 3691 domains in 33 s on an AMD64 2 GHz PC. This is 10 times faster than PROF_SIM and 17 times faster than COMPASS.

Availability: HHsearch can be downloaded from http://www.protevo.eb.tuebingen.mpg.de/download/ together with up-to-date versions of SCOP and PFAM. A web server is available at http://www.protevo.eb.tuebingen.mpg.de/toolkit/index.php?view=hhpred

Contact: johannes.soeding@tuebingen.mpg.de
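The sensitivity figures in the abstract compare how many homologs each method detects at a fixed rate of false positives of 10%. The abstract does not spell out how that rate is computed, so the following is only a minimal sketch under one assumed definition: the rate is taken as the fraction of false positives among all hits scoring above a threshold, and the reported quantity is the largest number of true positives reachable while that fraction stays at or below 10%. The function name `true_positives_at_error_rate` and the `(score, is_homolog)` input format are hypothetical and not taken from HHsearch.

```python
from typing import Iterable, Tuple


def true_positives_at_error_rate(
    hits: Iterable[Tuple[float, bool]],
    max_error_rate: float = 0.10,
) -> int:
    """Count true positives detectable at a bounded false-positive rate.

    hits: (score, is_homolog) pairs, higher score meaning a more confident hit.
    Assumption (not from the abstract): the rate is FP / (TP + FP) over all
    hits ranked at or above the current position. Tied scores are not
    treated specially in this sketch.
    """
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)
    tp = 0
    fp = 0
    best_tp = 0
    for _score, is_homolog in ranked:
        if is_homolog:
            tp += 1
        else:
            fp += 1
        # Keep the deepest cutoff at which the rate is still acceptable.
        if fp / (tp + fp) <= max_error_rate:
            best_tp = tp
    return best_tp


# Toy ranking: nine homologs, one non-homolog, one homolog, one non-homolog.
toy_labels = [True] * 9 + [False, True, False]
toy_hits = [(float(len(toy_labels) - i), label) for i, label in enumerate(toy_labels)]
detected = true_positives_at_error_rate(toy_hits)
```

Under this assumed definition, a statement such as "2.7 times more homologs" would correspond to the ratio of this count for two methods evaluated on the same benchmark pairs.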

Keywords

Hidden Markov model; False positive paradox; Sequence alignment; Pairwise comparison; Multiple sequence alignment; Homology (biology); Protein function prediction; Smith–Waterman algorithm; Pattern recognition (psychology); Homology modeling; Computer science; Artificial intelligence; Computational biology; Genetics; Biology; Peptide sequence; Protein function; Gene

Publication Info

Year: 2004
Type: article
Volume: 21
Issue: 7
Pages: 951-960
Citations: 2470
Access: Closed

Citation Metrics

2470
OpenAlex

Cite This

Johannes Söding (2004). Protein homology detection by HMM–HMM comparison. Bioinformatics, 21(7), 951-960. https://doi.org/10.1093/bioinformatics/bti125

Identifiers

DOI: 10.1093/bioinformatics/bti125