Abstract

Here, we present PatMatch, an efficient, web-based pattern-matching program that enables searches for short nucleotide or peptide sequences such as cis-elements in nucleotide sequences or small domains and motifs in protein sequences. The program can be used to find matches to a user-specified sequence pattern that can be described using ambiguous sequence codes and a powerful and flexible pattern syntax based on regular expressions. A recent upgrade has improved performance and now supports both mismatches and wildcards in a single pattern. This enhancement has been achieved by replacing the previous searching algorithm, scan_for_matches [D'Souza et al. (1997), Trends in Genetics, 13, 497-498], with nondeterministic-reverse grep (NR-grep), a general pattern matching tool that allows for approximate string matching [Navarro (2001), Software Practice and Experience, 31, 1265-1312]. We have tailored NR-grep to be used for DNA and protein searches with PatMatch. The stand-alone version of the software can be adapted for use with any sequence dataset and is available for download at The Arabidopsis Information Resource (TAIR) at ftp://ftp.arabidopsis.org/home/tair/Software/Patmatch/. The PatMatch server is available on the web at http://www.arabidopsis.org/cgi-bin/patmatch/nph-patmatch.pl for searching Arabidopsis thaliana sequences.
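The kind of search described above can be illustrated with a minimal Python sketch. It is not PatMatch's pattern syntax and it does not use NR-grep. It assumes the standard IUPAC nucleotide ambiguity codes, uses Python's `re` module for exact matching, and uses a naive sliding window (substitutions only) for mismatches. The function names `iupac_to_regex`, `find_matches` and `find_with_mismatches` are hypothetical and chosen for illustration only.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Translate an IUPAC nucleotide pattern into a regular expression string."""
    return "".join(IUPAC_DNA[ch] for ch in pattern.upper())


def find_matches(pattern, sequence):
    """Return (start, matched_text) for every exact match, including overlaps."""
    # A lookahead with a capture group reports overlapping occurrences too.
    regex = re.compile("(?=(" + iupac_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


def find_with_mismatches(pattern, sequence, max_mismatches):
    """Naive approximate search that allows substitutions only.

    This is an illustrative O(n * m) scan, not the NR-grep algorithm.
    Returns (start, window_text, mismatch_count) for each accepted window.
    """
    # One set of allowed bases per pattern position, e.g. "[AG]" -> {"A", "G"}.
    allowed = [set(IUPAC_DNA[ch].strip("[]")) for ch in pattern.upper()]
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - len(allowed) + 1):
        window = seq[start:start + len(allowed)]
        mismatches = sum(1 for base, ok in zip(window, allowed) if base not in ok)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits
```

For example, `find_matches("CANNTG", "TTCACGTGAA")` locates the ambiguous pattern at offset 2, and `find_with_mismatches("CACGTG", "TTCACCTGAA", 1)` accepts the window `CACCTG` with one substitution.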

Keywords

Biology, Software, Arabidopsis, Web server, Sequence (biology), File Transfer Protocol, Computer science, String (physics), Approximate string matching, Computational biology, GenBank, Pattern matching, Genetics, Programming language, The Internet, Operating system, Mathematics

Publication Info

Year: 2005
Type: article
Volume: 33
Issue: Web Server
Pages: W262-W266
Citations: 189
Access: Closed

Cite This

Tingting Yan, Dong Hyun Yoo, Tanya Berardini et al. (2005). PatMatch: a program for finding patterns in peptide and nucleotide sequences. Nucleic Acids Research, 33 (Web Server), W262-W266. https://doi.org/10.1093/nar/gki368

Identifiers

DOI: 10.1093/nar/gki368