Abstract
The analysis of data from automated DNA sequencing instruments has been a limiting factor in the development of new sequencing technology. A new base-calling algorithm that is intended to be independent of any particular sequencing technology has been developed and shown to be effective with data from the Applied Biosystems 373 sequencing system. This algorithm makes use of a nonlinear deconvolution filter to detect likely oligomer events and a graph theoretic editing strategy to find the subset of those events that is most likely to correspond to the correct sequence. Metrics evaluating the quality and accuracy of the resulting sequence are also generated and have been shown to be predictive of measured error rates. Compared to the Applied Biosystems Analysis software, this algorithm generates 18% fewer insertion errors, 80% more deletion errors, and 4% fewer mismatches. The tradeoff between different types of errors can be controlled through a secondary editing step that inserts or deletes base calls depending on their associated confidence values.
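The abstract does not give the details of the graph-theoretic editing step, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes candidate events are nodes in a directed acyclic graph ordered by scan position, and it selects the highest-scoring path by dynamic programming. The `Event` fields, the `spacing_penalty` function, and the `expected_gap` parameter are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A candidate oligomer event (all fields are illustrative assumptions)."""
    position: float  # scan position of the candidate peak
    base: str        # base assigned to the candidate peak
    score: float     # detection confidence, higher is better


def spacing_penalty(gap, expected_gap):
    """Hypothetical cost for a gap that deviates from the expected peak spacing."""
    return abs(gap - expected_gap) / expected_gap


def best_event_path(events, expected_gap=10.0):
    """Return the ordered subset of events with the highest total score.

    Events are treated as nodes of a DAG sorted by position. An edge from an
    earlier event to a later one costs spacing_penalty, and each visited node
    contributes its score. The best path is found by dynamic programming.
    """
    events = sorted(events, key=lambda e: e.position)
    n = len(events)
    if n == 0:
        return []
    best = [e.score for e in events]  # best path score ending at each event
    prev = [None] * n                 # predecessor index on that best path
    for j in range(n):
        for i in range(j):
            gap = events[j].position - events[i].position
            cand = best[i] + events[j].score - spacing_penalty(gap, expected_gap)
            if cand > best[j]:
                best[j] = cand
                prev[j] = i
    # Backtrack from the event where the best-scoring path ends.
    end = max(range(n), key=best.__getitem__)
    path = []
    while end is not None:
        path.append(events[end])
        end = prev[end]
    path.reverse()
    return path


def call_bases(events, expected_gap=10.0):
    """Concatenate the bases along the selected path."""
    return "".join(e.base for e in best_event_path(events, expected_gap))
```

In this toy formulation, a weak event crowded against a strong neighbor tends to be left off the path because its score does not cover the spacing penalty. A real base caller would need penalties and scores calibrated to the instrument.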
Related Publications
Base-Calling of Automated Sequencer Traces Using Phred. I. Accuracy Assessment
The availability of massive amounts of DNA sequence information has begun to revolutionize the practice of biology. As a result, current large-scale sequencing output, while imp...
ART: a next-generation sequencing read simulator
Summary: ART is a set of simulation tools that generate synthetic next-generation sequencing reads. This functionality is essential for testing and benchmarking tools f...
Evaluation of next generation sequencing platforms for population targeted sequencing studies
Background: Next generation sequencing (NGS) platforms are currently being utilized for targeted sequencing of candidate genes or genomic intervals to perform sequence-based assoc...
Substantial biases in ultra-short read data sets from high-throughput DNA sequencing
Novel sequencing technologies permit the rapid production of large sequence data sets. These technologies are likely to revolutionize genetics and biomedical research, ...
A parallel graph decomposition algorithm for DNA sequencing with nanopores
Motivation: With the potential availability of nanopore devices that can sense the bases of translocating single-stranded DNA (ssDNA), it is likely that ‘reads’ of leng...
Publication Info
- Year: 1996
- Type: article
- Volume: 6
- Issue: 2
- Pages: 80-91
- Citations: 40
- Access: Closed
Identifiers
- DOI: 10.1101/gr.6.2.80