Abstract

Inexpensive de novo genome sequencing, particularly in organisms with small genomes, is now possible using several new sequencing technologies. Some of these technologies, such as Illumina's Solexa sequencing, produce high genomic coverage by generating a very large number of short reads (∼30 bp). While prior work shows that partial assembly can be performed by k-mer extension in error-free reads, this algorithm is unsuccessful at the sequencing error rates found in practice. We present VCAKE (Verified Consensus Assembly by K-mer Extension), a modification of simple k-mer extension that overcomes error by using high depth of coverage. Although it is a simple modification of a previous approach, we show significant improvements in assembly results on simulated and experimental datasets that include error.

Availability: http://152.2.15.114/~labweb/VCAKE

Contact: william.jeck@gmail.com
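
The abstract describes the idea only at a high level, so the following is a minimal sketch of consensus-based k-mer extension, not VCAKE's actual implementation. It assumes that each read containing the current contig suffix "votes" for the next base, and that extension continues only while one base clearly dominates. The function name `extend_seed` and the parameters `k`, `min_votes`, `min_frac`, and `max_len` are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def extend_seed(seed, reads, k=8, min_votes=2, min_frac=0.6, max_len=1000):
    """Greedily extend `seed` to the right by majority vote on the next base.

    Illustrative only: thresholds and voting rule are assumptions,
    not values taken from the paper.
    """
    contig = seed
    while len(contig) < max_len:
        suffix = contig[-k:]
        votes = Counter()
        for read in reads:
            # Each occurrence of the suffix that is followed by a base casts one vote.
            pos = read.find(suffix)
            while pos != -1:
                nxt = pos + len(suffix)
                if nxt < len(read):
                    votes[read[nxt]] += 1
                pos = read.find(suffix, pos + 1)
        if not votes:
            break  # no read extends past the current suffix
        base, count = votes.most_common(1)[0]
        total = sum(votes.values())
        # Stop when support is too thin or too ambiguous to trust.
        if count < min_votes or count / total < min_frac:
            break
        contig += base
    return contig


if __name__ == "__main__":
    # Toy data: threefold coverage by 12 bp windows plus one erroneous read.
    genome = "ACGTTGCATGGCTAAGCTTACGGATCCA"
    reads = [genome[i:i + 12] for i in range(len(genome) - 11)] * 3
    reads.append(genome[2:13] + "C")  # substitution error at the last base
    print(extend_seed(genome[:8], reads))
```

With deep coverage, a single erroneous read is outvoted by the correct reads at the same position. Setting `min_frac=1.0` in this sketch requires unanimous votes, so extension halts at the first conflicting read, which is the behaviour the abstract attributes to simple k-mer extension on data containing errors.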

Keywords

k-mer; Sequence assembly; Extension (predicate logic); DNA sequencing; Computer science; Hybrid genome assembly; Simple (philosophy); Genome; Computational biology; Algorithm; Deep sequencing; Error detection and correction; Reference genome; Biology; DNA; Genetics; Gene

Publication Info

Year: 2007
Type: article
Volume: 23
Issue: 21
Pages: 2942-2944
Citations: 267 (OpenAlex)
Access: Closed

Cite This

William R. Jeck, Josephine A. Reinhardt, David A. Baltrus et al. (2007). Extending assembly of short DNA sequences to handle error. Bioinformatics, 23(21), 2942-2944. https://doi.org/10.1093/bioinformatics/btm451

Identifiers

DOI: 10.1093/bioinformatics/btm451