Abstract

Background: Amino-terminal signal peptides (SPs) are short regions that guide the targeting of secretory proteins to the correct subcellular compartments in the cell. They are cleaved off once the passenger protein reaches its destination. The explosive growth in sequencing technologies has led to the deposition of vast numbers of protein sequences, necessitating rapid functional annotation techniques, with subcellular localization being a key feature. Myriad software tools have been developed to automate the assignment of SP cleavage sites in these new sequences; here we review the performance and reliability of commonly used SP prediction tools.

Results: The available signal peptide data has been manually curated and organized into three datasets representing eukaryotes, Gram-positive bacteria and Gram-negative bacteria. These datasets are used to evaluate thirteen publicly available prediction tools. SignalP (both the HMM and ANN versions) maintains consistency and achieves the best overall accuracy in all three benchmarking experiments, ranging from 0.872 to 0.914, although other prediction tools are narrowing the performance gap.

Conclusion: The majority of the tools evaluated in this study encounter no difficulty in discriminating between secretory and non-secretory proteins. The challenge clearly remains pinpointing the correct SP cleavage site. The composite scoring schemes employed by SignalP may help to explain its accuracy: the prediction task is divided into a number of separate steps, allowing each score to tackle a particular aspect of the prediction.

Keywords

Signal peptide
Computer science
Benchmarking
Protein sequencing
Computational biology
Artificial intelligence
Machine learning
Data mining
Bioinformatics
Biology
Peptide sequence
Gene
Genetics

MeSH Terms

Computational Biology
Databases, Protein
Protein Sorting Signals
Proteins
Sequence Analysis, Protein

Publication Info

Year
2009
Type
article
Volume
10
Issue
S15
Pages
S2-S2
Citations
70
Access
Closed

Citation Metrics

OpenAlex
70
Influential
4
CrossRef
54

Cite This

Khar Heng Choo, Tin Wee Tan, Shoba Ranganathan (2009). A comprehensive assessment of N-terminal signal peptides prediction methods. BMC Bioinformatics, 10(S15), S2-S2. https://doi.org/10.1186/1471-2105-10-s15-s2

Identifiers

DOI
10.1186/1471-2105-10-s15-s2
PMID
19958512
PMCID
PMC2788353
