Abstract

Background: Next generation sequencing (NGS) platforms are currently being utilized for targeted sequencing of candidate genes or genomic intervals to perform sequence-based association studies. To evaluate these platforms for this application, we analyzed human sequence generated by the Roche 454, Illumina GA, and ABI SOLiD technologies for the same 260 kb in four individuals.

Results: Local sequence characteristics contribute to systematic variability in sequence coverage (>100-fold difference in per-base coverage), resulting in patterns for each NGS technology that are highly correlated between samples. A comparison of the base calls to 88 kb of overlapping ABI 3730xL Sanger sequence generated for the same samples showed that the NGS platforms all have high sensitivity, identifying >95% of variant sites. At high coverage depth, base calling errors are systematic, resulting from local sequence contexts; as the coverage is lowered, additional 'random sampling' errors in base calling occur.

Conclusions: Our study provides important insights into systematic biases and data variability that need to be considered when utilizing NGS platforms for population targeted sequencing studies.

Keywords

Biology; Human genetics; DNA sequencing; Genome Biology; Computational biology; Personal genomics; Genomics; Genetics; Evolutionary biology; Genome; Gene

MeSH Terms

Base Sequence; Computer Simulation; False Positive Reactions; Genetics, Population; Genotype; Humans; Oligonucleotide Array Sequence Analysis; Polymorphism, Single Nucleotide; Sequence Alignment; Sequence Analysis, DNA

Publication Info

Year: 2009
Type: article
Volume: 10
Issue: 3
Pages: R32-R32
Citations: 608
Access: Closed

Citation Metrics

OpenAlex: 608
Influential: 32

Cite This

Olivier Harismendy, Pauline C. Ng, Robert L. Strausberg et al. (2009). Evaluation of next generation sequencing platforms for population targeted sequencing studies. Genome Biology, 10(3), R32-R32. https://doi.org/10.1186/gb-2009-10-3-r32

Identifiers

DOI: 10.1186/gb-2009-10-3-r32
PMID: 19327155
PMCID: PMC2691003
