Abstract

Motivation: Most multiple sequence alignment programs use heuristics that sometimes introduce errors into the alignment. The most commonly used methods to correct these errors use iterative techniques to maximize an objective function. We present here an alternative, knowledge-based approach that combines a number of recently developed methods into a two-step refinement process. The alignment is divided horizontally and vertically to form a 'lattice' in which well-aligned regions can be differentiated. Alignment correction is then restricted to the less reliable regions, leading to a more reliable and efficient refinement strategy.

Results: The accuracy and reliability of RASCAL are demonstrated using: (i) alignments from the BAliBASE benchmark database, where significant improvements were often observed, with no deterioration of the existing high-quality regions; (ii) a large-scale study involving 946 alignments from the ProDom protein domain database, where alignment quality was increased in 68% of the cases; and (iii) an automatic pipeline to obtain a high-quality alignment of 695 full-length nuclear receptor proteins, which took 11 min on a DEC Alpha 6100 computer.

Availability: RASCAL is available at ftp://ftp-igbmc.u-strasbg.fr/pub/RASCAL

Contact: poch@igbmc.u-strasbg.fr

Supplementary information: http://bioinfo-igbmc.u-strasbourg.fr/BioInfo/RASCAL/paper/rascal_supp.html
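The lattice idea in the Motivation paragraph can be illustrated with a small sketch. This is not RASCAL's algorithm, which the abstract does not specify. It assumes that the horizontal division yields groups of sequences, that the vertical division yields fixed-width column blocks, and that a cell is "reliable" when its mean pairwise identity exceeds a threshold. The function names, the hand-supplied groups, the block width and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations


def column_identity(column):
    """Fraction of residue pairs in one column that are identical.

    Gap characters ('-') are ignored; a column with fewer than two
    residues scores 0.0. This scoring rule is an illustrative assumption.
    """
    residues = [c for c in column if c != "-"]
    pairs = list(combinations(residues, 2))
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def lattice_scores(alignment, groups, block_width):
    """Score every (sequence group, column block) cell of the lattice.

    alignment   : list of equal-length aligned strings
    groups      : list of lists of row indices (assumed given, e.g. by clustering)
    block_width : number of columns per vertical block (assumed fixed)
    Returns a dict mapping (group index, block start column) to a mean identity.
    """
    length = len(alignment[0])
    scores = {}
    for g, members in enumerate(groups):
        rows = [alignment[i] for i in members]
        for start in range(0, length, block_width):
            cols = range(start, min(start + block_width, length))
            per_col = [column_identity([r[c] for r in rows]) for c in cols]
            scores[(g, start)] = sum(per_col) / len(per_col)
    return scores


def unreliable_cells(scores, threshold=0.5):
    """Cells whose score falls below the (hypothetical) reliability threshold."""
    return sorted(cell for cell, s in scores.items() if s < threshold)


if __name__ == "__main__":
    toy_alignment = [
        "ACDEFGHK",
        "ACDEWGHK",
        "ACDE-LMN",
        "ACDEPQ-R",
    ]
    toy_groups = [[0, 1], [2, 3]]  # hand-made groups, for illustration only
    cells = lattice_scores(toy_alignment, toy_groups, block_width=4)
    print(unreliable_cells(cells))
```

In a refinement loop built on this sketch, only the cells returned by `unreliable_cells` would be passed to a realignment step, which mirrors the abstract's point that correction is restricted to the less reliable regions.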

Keywords

Computer science, Multiple sequence alignment, Heuristics, Benchmark (surveying), Pipeline (software), Sequence alignment, Algorithm, Sequence (biology), Domain (mathematical analysis), Data mining, Pattern recognition (psychology), Artificial intelligence, Mathematics, Peptide sequence, Programming language

Publication Info

Year
2003
Type
article
Volume
19
Issue
9
Pages
1155-1161
Citations
142
Access
Closed

Cite This

Julie Thompson, J. C. Thierry, Olivier Poch (2003). RASCAL: rapid scanning and correction of multiple sequence alignments. Bioinformatics, 19(9), 1155-1161. https://doi.org/10.1093/bioinformatics/btg133

Identifiers

DOI
10.1093/bioinformatics/btg133