Abstract

Next-generation sequencing is limited to short read lengths and by high error rates. We systematically analyzed sources of noise in the Illumina Genome Analyzer that contribute to these high error rates and developed a base caller, Alta-Cyclic, that uses machine learning to compensate for noise factors. Alta-Cyclic substantially improved the number of accurate reads for sequencing runs up to 78 bases and reduced systematic biases, facilitating confident identification of sequence variants.

Keywords

Identification (biology); Computational biology; Computer science; Noise (video); DNA sequencing; Base (topology); Genome; Biology; Genetics; Artificial intelligence; Gene; Mathematics

MeSH Terms

Animals; Databases, Nucleic Acid; Humans; Research Design; Sensitivity and Specificity; Sequence Analysis, DNA; Software

Publication Info

Year: 2008
Type: article
Volume: 5
Issue: 8
Pages: 679-682
Citations: 207
Access: Closed

Citation Metrics

OpenAlex: 207
Influential: 5
CrossRef: 162

Cite This

Yaniv Erlich, Partha P. Mitra, Melissa delaBastide et al. (2008). Alta-Cyclic: a self-optimizing base caller for next-generation sequencing. Nature Methods, 5(8), 679-682. https://doi.org/10.1038/nmeth.1230

Identifiers

DOI: 10.1038/nmeth.1230
PMID: 18604217
PMCID: PMC2978646

Data Quality

Data completeness: 86%