Abstract

Single-cell transcriptomics has transformed our ability to characterize cell states, but deep biological understanding requires more than a taxonomic listing of clusters. As new methods arise to measure distinct cellular modalities, a key analytical challenge is to integrate these datasets to better understand cellular identity and function. Here, we develop a strategy to "anchor" diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities. After demonstrating improvement over existing methods for integrating scRNA-seq data, we anchor scRNA-seq experiments with scATAC-seq to explore chromatin differences in closely related interneuron subsets and project protein expression measurements onto a bone marrow atlas to characterize lymphocyte populations. Lastly, we harmonize in situ gene expression and scRNA-seq datasets, allowing transcriptome-wide imputation of spatial gene expression patterns. Our work presents a strategy for the assembly of harmonized references and transfer of information across datasets.

Keywords

Biology; Computational biology; Data integration; Computer science; Data mining

MeSH Terms

Databases, Nucleic Acid; Gene Expression Profiling; Humans; Sequence Analysis, RNA; Single-Cell Analysis; Software; Transcriptome

Publication Info

Year: 2019
Type: article
Volume: 177
Issue: 7
Pages: 1888-1902.e21
Citations: 15461
Access: Closed

Citation Metrics

OpenAlex: 15461
CrossRef: 13587

Cite This

Tim Stuart, Andrew Butler, Paul Hoffman et al. (2019). Comprehensive Integration of Single-Cell Data. Cell, 177(7), 1888-1902.e21. https://doi.org/10.1016/j.cell.2019.05.031

Identifiers

DOI: 10.1016/j.cell.2019.05.031
PMID: 31178118
PMCID: PMC6687398
