Abstract

High-throughput technologies are widely used, for example to assay genetic variants, gene and protein expression, and epigenetic modifications. One often overlooked complication with such studies is batch effects, which occur because measurements are affected by laboratory conditions, reagent lots and personnel differences. This becomes a major problem when batch effects are correlated with an outcome of interest, leading to incorrect conclusions. Using both published studies and our own analyses, we argue that batch effects (as well as other technical and biological artefacts) are widespread and critical to address. We review experimental and computational approaches for doing so.
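To make the idea concrete, here is a minimal numpy sketch (an illustration only, not the specific method reviewed in the paper) that simulates a batch effect as an additive shift between two processing batches of one gene's measurements, then removes it by re-centering each batch on the overall mean. Real batch-correction methods such as ComBat go further, also adjusting batch-specific variance and borrowing strength across genes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate expression values for one gene measured in two processing batches.
# Batch 1 adds a systematic offset of 1.5 units: the "batch effect".
n = 50
batch = np.repeat([0, 1], n)                   # batch label per sample
batch_offset = np.array([0.0, 1.5])[batch]     # additive shift for batch 1
expression = rng.normal(loc=5.0, scale=1.0, size=2 * n) + batch_offset

# Naive correction: re-center each batch on the overall mean.
# This removes purely additive batch shifts.
overall_mean = expression.mean()
corrected = expression.copy()
for b in np.unique(batch):
    corrected[batch == b] += overall_mean - expression[batch == b].mean()

raw_gap = abs(expression[batch == 1].mean() - expression[batch == 0].mean())
corrected_gap = abs(corrected[batch == 1].mean() - corrected[batch == 0].mean())
print(raw_gap)        # large: the simulated batch effect
print(corrected_gap)  # near zero after centering
```

The danger the abstract warns about is visible here: if the outcome of interest (say, case vs. control) had been confounded with the batch label, this same centering step would have erased the biological signal along with the artefact, which is why correlation between batch and outcome is the critical failure mode.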

Keywords

Throughput; Biology; Computational biology; Epigenetics; Computer science; Biochemical engineering; Gene; Genetics; Engineering

MeSH Terms

Biotechnology; Computational Biology; Genomics; Oligonucleotide Array Sequence Analysis; Periodicals as Topic; Research Design; Sequence Analysis, DNA

Publication Info

Year
2010
Type
review
Volume
11
Issue
10
Pages
733-739
Citations
2101
Access
Closed

Citation Metrics

OpenAlex
2101
Influential
53
CrossRef
1768

Cite This

Jeffrey T. Leek, Robert B. Scharpf, Héctor Corrada Bravo et al. (2010). Tackling the widespread and critical impact of batch effects in high-throughput data. Nature Reviews Genetics, 11(10), 733-739. https://doi.org/10.1038/nrg2825

Identifiers

DOI
10.1038/nrg2825
PMID
20838408
PMCID
PMC3880143
