Abstract

Motivation: Generation of structural models and recognition of homologous relationships for unannotated protein sequences are fundamental problems in bioinformatics. Improving the sensitivity and selectivity of methods designed for these two tasks therefore has downstream benefits for many other bioinformatics applications.

Results: We describe the latest implementation of the GenTHREADER method for structure prediction on a genomic scale. The method combines profile–profile alignments with secondary-structure specific gap-penalties, classic pair- and solvation potentials using a linear combination optimized with a regression SVM model. We find this combination significantly improves both detection of useful templates and accuracy of sequence-structure alignments relative to other competitive approaches. We further present a second implementation of the protocol designed for the task of discriminating superfamilies from one another. This method, pDomTHREADER, is the first to incorporate both sequence and structural data directly in this task and improves sensitivity and selectivity over the standard version of pGenTHREADER and three other standard methods for remote homology detection.

Contact: d.jones@cs.ucl.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.
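The linear combination of score terms mentioned in the Results can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the `TemplateHit` fields, the `WEIGHTS` and `BIAS` values, and the sign convention (alignment scores higher-is-better, potentials lower-is-better) are hypothetical and are not taken from the paper, where the combination is optimized with a regression SVM.

```python
from dataclasses import dataclass

# Hypothetical weights, NOT the published coefficients. In the paper the
# combination is optimized with a regression SVM; fixed constants are used
# here only to show the shape of a linear scoring function.
WEIGHTS = {
    "profile_alignment": 1.0,     # assumed: higher alignment score is better
    "pair_potential": -0.5,       # assumed: lower (more negative) energy is better
    "solvation_potential": -0.25, # assumed: lower (more negative) energy is better
}
BIAS = 0.0


@dataclass
class TemplateHit:
    """Per-template score terms (illustrative field names)."""
    template_id: str
    profile_alignment: float
    pair_potential: float
    solvation_potential: float


def combined_score(hit: TemplateHit) -> float:
    """Weighted linear combination of the individual score terms."""
    return BIAS + sum(w * getattr(hit, name) for name, w in WEIGHTS.items())


def rank_templates(hits: list[TemplateHit]) -> list[TemplateHit]:
    """Order candidate templates from best to worst combined score."""
    return sorted(hits, key=combined_score, reverse=True)


# Made-up example values, not real threading output.
hits = [
    TemplateHit("templateA", 40.0, -10.0, -4.0),
    TemplateHit("templateB", 55.0, 20.0, 8.0),
    TemplateHit("templateC", 30.0, -2.0, -1.0),
]
ranked = rank_templates(hits)
```

In this toy data, `templateB` has the highest raw alignment score but unfavourable potentials, so the combined score ranks `templateA` above it. That is the kind of re-ranking a combined score is meant to allow, though the real method's behaviour depends on its fitted weights.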

Keywords

Computer science, Support vector machine, Data mining, Pattern recognition (psychology), Artificial intelligence, Sensitivity (control systems), Computational biology, Machine learning, Biology

Publication Info

Year: 2009
Type: article
Volume: 25
Issue: 14
Pages: 1761-1767
Citations: 302
Access: Closed

Citation Metrics

302 (OpenAlex)

Cite This

Anna Lobley, Michael I. Sadowski, David T. Jones (2009). pGenTHREADER and pDomTHREADER: new methods for improved protein fold recognition and superfamily discrimination. Bioinformatics, 25(14), 1761-1767. https://doi.org/10.1093/bioinformatics/btp302

Identifiers

DOI: 10.1093/bioinformatics/btp302