Abstract
Motivation: Identification of transcription factor binding sites is an important aspect of the analysis of genetic regulation. Many programs have been developed for the de novo discovery of a binding motif (a collection of binding sites). Recently, a scoring function formulation was derived that allows for the comparison of discovered motifs from different programs [S.T. Jensen, X.S. Liu, Q. Zhou and J.S. Liu (2004) Stat. Sci., 19, 188–204]. A simple program, BioOptimizer, was proposed [S.T. Jensen and J.S. Liu (2004) Bioinformatics, 20, 1557–1564] that improves discovered motifs by optimizing a scoring function. However, BioOptimizer is a very simple algorithm that can only make local improvements upon an already discovered motif, so it can only be used in conjunction with other motif-finding software.

Results: We introduce GAME, software that uses a genetic algorithm to find optimal motifs in DNA sequences. GAME evolves motifs with high fitness from a population of randomly generated starting motifs, which eliminates the reliance on additional motif-finding programs. In addition to the standard genetic operations, GAME incorporates two operators that are specific to the motif discovery problem. We demonstrate the superior performance of GAME compared with MEME, BioProspector and BioOptimizer in simulation studies as well as in several real data applications, where we use an extended version of the GAME algorithm that allows the motif width to be unknown.

Availability:

Contact: zhiwei@mail.med.upenn.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
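To make the general idea concrete, here is a minimal sketch of a genetic algorithm for fixed-width motif search. It is not the GAME implementation. The fitness function is a plain information-content score that stands in for the scoring function cited in the abstract, the two motif-specific operators are not reproduced, and every name and parameter (`fitness`, `evolve`, `pop_size`, `mut_rate` and so on) is an assumption made for illustration.

```python
import math
import random

BASES = "ACGT"


def fitness(seqs, starts, width, pseudo=0.5):
    """Information content (in bits) of the alignment implied by `starts`.

    Assumption: a stand-in score, not the scoring function used by GAME.
    """
    total = 0.0
    denom = len(seqs) + 4 * pseudo
    for col in range(width):
        counts = {b: pseudo for b in BASES}
        for seq, start in zip(seqs, starts):
            counts[seq[start + col]] += 1
        for b in BASES:
            freq = counts[b] / denom
            total += freq * math.log2(freq / 0.25)  # uniform background
    return total


def evolve(seqs, width, pop_size=50, generations=100, mut_rate=0.2, seed=0):
    """Evolve one motif start position per sequence (needs >= 2 sequences)."""
    rng = random.Random(seed)
    max_start = [len(s) - width for s in seqs]
    # Population of randomly generated starting motifs.
    pop = [[rng.randint(0, m) for m in max_start] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged (elitism).
        pop.sort(key=lambda ind: fitness(seqs, ind, width), reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randint(1, len(seqs) - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mut_rate:  # mutation: move one site at random
                i = rng.randrange(len(seqs))
                child[i] = rng.randint(0, max_start[i])
            children.append(child)
        pop = survivors + children
    best = max(pop, key=lambda ind: fitness(seqs, ind, width))
    return best, fitness(seqs, best, width)
```

Calling `evolve(seqs, width=6)` on a list of DNA strings over `ACGT` returns the best start positions found and their score. Because the fitter half of each generation is carried over unchanged, the best score never decreases from one generation to the next.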
Related Publications
FANMOD: a tool for fast network motif detection
Summary: Motifs are small connected subnetworks that a network displays in significantly higher frequencies than would be expected for a random network. They have recen...
RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference
Motivation: Phylogenies are important for fundamental biological research, but also have numerous applications in biotechnology, agriculture and medicine. Finding the op...
Predicting functionally important residues from sequence conservation
Motivation: All residues in a protein are not equally important. Some are essential for the proper structure and function of the protein, whereas others can be readily ...
Genome-Wide Mapping of in Vivo Protein-DNA Interactions
In vivo protein-DNA interactions connect each transcription factor with its direct targets to form a gene network scaffold. To map these protein-DNA interactions comprehensively...
Transcriptional Regulatory Networks in Saccharomyces cerevisiae
We have determined how most of the transcriptional regulators encoded in the eukaryote Saccharomyces cerevisiae associate with genes across the genome in living cells. Just as m...
Publication Info
- Year: 2006
- Type: article
- Volume: 22
- Issue: 13
- Pages: 1577-1584
- Citations: 97
- Access: Closed
Identifiers
- DOI: 10.1093/bioinformatics/btl147