Protein homology detection by HMM–HMM comparison

Abstract

Motivation: Protein homology detection and sequence alignment are at the basis of protein structure prediction, function prediction and evolution.

Results: We have generalized the alignment of protein sequences with a profile hidden Markov model (HMM) to the case of pairwise alignment of profile HMMs. We present a method for detecting distant homologous relationships between proteins based on this approach. The method (HHsearch) is benchmarked together with BLAST, PSI-BLAST, HMMER and the profile–profile comparison tools PROF_SIM and COMPASS, in an all-against-all comparison of a database of 3691 protein domains from SCOP 1.63 with pairwise sequence identities below 20%.

Sensitivity: When the predicted secondary structure is included in the HMMs, HHsearch is able to detect between 2.7 and 4.2 times more homologs than PSI-BLAST or HMMER and between 1.44 and 1.9 times more than COMPASS or PROF_SIM for a rate of false positives of 10%. Approximately half of the improvement over the profile–profile comparison methods is attributable to the use of profile HMMs in place of simple profiles.

Alignment quality: Higher sensitivity is mirrored by an increased alignment quality. HHsearch produced 1.2, 1.7 and 3.3 times more good alignments (‘balanced’ score >0.3) than the next best method (COMPASS), and 1.6, 2.9 and 9.4 times more than PSI-BLAST, at the family, superfamily and fold level, respectively.

Speed: HHsearch scans a query of 200 residues against 3691 domains in 33 s on an AMD64 2 GHz PC. This is 10 times faster than PROF_SIM and 17 times faster than COMPASS.

Availability: HHsearch can be downloaded from http://www.protevo.eb.tuebingen.mpg.de/download/ together with up-to-date versions of SCOP and PFAM. A web server is available at http://www.protevo.eb.tuebingen.mpg.de/toolkit/index.php?view=hhpred

Contact: johannes.soeding@tuebingen.mpg.de
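The sensitivity figures above compare how many homologs each method detects at a fixed false-positive rate of 10%. As a rough illustration of that kind of measurement, the sketch below counts true positives in a score-ranked hit list. It is not the paper's benchmark code: the function name `homologs_at_fp_rate`, the `(score, is_homolog)` input format, and the definition of the false-positive rate as FP / (TP + FP) among accepted hits are all assumptions made for this example.

```python
from typing import Iterable, Tuple


def homologs_at_fp_rate(
    hits: Iterable[Tuple[float, bool]], max_fp_rate: float = 0.10
) -> int:
    """Count homologs detected at a given false-positive rate.

    hits: (score, is_homolog) pairs; a higher score means a more confident hit.
    Assumption: the false-positive rate is FP / (TP + FP) over all hits
    accepted above the score threshold. Score ties are not treated specially.
    """
    # Rank hits from most to least confident.
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    tp = fp = 0
    best_tp = 0
    for _score, is_homolog in ranked:
        if is_homolog:
            tp += 1
        else:
            fp += 1
        # Remember the largest TP count seen while the rate stays within bounds.
        if fp / (tp + fp) <= max_fp_rate:
            best_tp = tp
    return best_tp


# Hypothetical ranked hit list: nine homologs, a false hit, a homolog, a false hit.
example_hits = (
    [(20.0 - i, True) for i in range(9)]
    + [(10.0, False), (9.0, True), (8.0, False)]
)
detected = homologs_at_fp_rate(example_hits)
```

Running the same function on the hit lists of two methods and dividing the counts gives a ratio of the kind quoted in the abstract, under the assumptions stated above.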

Keywords

Hidden Markov model, False positive paradox, Sequence alignment, Pairwise comparison, Multiple sequence alignment, Homology (biology), Protein function prediction, Smith–Waterman algorithm, Pattern recognition (psychology), Homology modeling, Computer science, Artificial intelligence, Computational biology, Genetics, Biology, Peptide sequence, Protein function, Gene

Affiliated Institutions

Max Planck Institute for Developmental Biology, Germany

Publication Info

Year: 2004
Type: article
Volume: 21
Issue: 7
Pages: 951-960
Citations: 2470
Access: Closed

Cite This

APA Style

Johannes Söding (2004). Protein homology detection by HMM–HMM comparison. Bioinformatics, 21(7), 951-960. https://doi.org/10.1093/bioinformatics/bti125

Identifiers

DOI: 10.1093/bioinformatics/bti125