Abstract
We present a framework for designing optimal assembly algorithms for shotgun sequencing under the criterion of complete reconstruction. We derive a lower bound on the read length and the coverage depth required for reconstruction in terms of the repeat statistics of the genome. Building on earlier works, we design a de Bruijn graph-based assembly algorithm that comes very close to the lower bound for the repeat statistics of a wide range of sequenced genomes, including the GAGE datasets. The results are based on a set of necessary and sufficient conditions on the DNA sequence and the reads for reconstruction. These conditions can be viewed as the shotgun sequencing analogue of the Ukkonen-Pevzner necessary and sufficient conditions for Sequencing by Hybridization.
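As a rough illustration of the de Bruijn graph idea mentioned in the abstract, the sketch below builds a graph over (k-1)-mers from error-free reads and extends a sequence while the next node is unambiguous. It is a toy under stated assumptions, not the algorithm of the paper: the function names `build_de_bruijn` and `walk_unambiguous`, the 10-base genome, and the read length of 6 are all hypothetical choices made for the example.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous(graph, start):
    """Extend from `start` while the current node has exactly one successor."""
    seq, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop instead of looping forever on a cycle
            break
        seq += nxt[-1]
        seen.add(nxt)
        node = nxt
    return seq


# Hypothetical toy genome and error-free reads of length 6 at every offset.
genome = "ATGGCGTGCA"
reads = [genome[i:i + 6] for i in range(len(genome) - 5)]

# With k = 4, every 3-mer of the toy genome is unique, so the walk is unambiguous.
assembled = walk_unambiguous(build_de_bruijn(reads, k=4), genome[:3])

# With k = 3, the 2-mer "TG" repeats, so the walk stalls at the first branch.
stalled = walk_unambiguous(build_de_bruijn(reads, k=3), genome[:2])
```

In this toy, the larger k recovers the whole sequence while the smaller k stops at the first repeated 2-mer, which is the kind of dependence on repeat structure that the abstract's lower bound concerns.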
Related Publications
A New Algorithm for DNA Sequence Assembly
Since the advent of rapid DNA sequencing methods in 1976, scientists have had the problem of inferring DNA sequences from sequenced fragments. Shotgun sequencing is a well-estab...
Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study
Background: With the fast advances in nextgen sequencing technology, high-throughput RNA sequencing has emerged as a powerful and cost-effective way for transcriptome st...
SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler
Background: There is a rapidly increasing amount of de novo genome assembly using next-generation sequencing (NGS) short reads; however, several big challenges remain to...
The Phusion Assembler
The Phusion assembler has assembled the mouse genome from the whole-genome shotgun (WGS) dataset collected by the Mouse Genome Sequencing Consortium, at ∼7.5× sequence coverage,...
Efficiently detecting polymorphisms during the fragment assembly process
Motivation: Current genomic sequence assemblers assume that the input data is derived from a single, homogeneous source. However, recent whole-genome shotgun sequencing...
Publication Info
- Year: 2013
- Type: article
- Volume: 14
- Issue: S5
- Pages: S18-S18
- Citations: 90
- Access: Closed
Identifiers
- DOI: 10.1186/1471-2105-14-s5-s18
- PMID: 23902516
- PMCID: PMC3706340
- arXiv: 1301.0068