Abstract

Background: SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis, amongst other methods.

Findings: The first version appeared online 12 years ago and has been maintained and further developed ever since, with many new features and improvements added over the years. The SAMtools and BCFtools packages represent a unique collection of tools that have been used in numerous other software projects and countless genomic pipelines.

Conclusion: Both SAMtools and BCFtools are freely available on GitHub under the permissive MIT licence, free for both non-commercial and commercial use. Both packages have been installed >1 million times via Bioconda. The source code and documentation are available from https://www.htslib.org.

Keywords

Documentation, Computer science, Software, Sorting, File format, World Wide Web, Data science, Database, Programming language

MeSH Terms

Genome, Genomics, High-Throughput Nucleotide Sequencing, Software

Publication Info

Year: 2021
Type: article
Volume: 10
Issue: 2
Citations: 13080
Access: Closed

Citation Metrics

OpenAlex: 13080
Influential: 1500
CrossRef: 11920

Cite This

Petr Danecek, James Bonfield, Jennifer Liddle et al. (2021). Twelve years of SAMtools and BCFtools. GigaScience, 10(2). https://doi.org/10.1093/gigascience/giab008

Identifiers

DOI: 10.1093/gigascience/giab008
PMID: 33590861
PMCID: PMC7931819
arXiv: 2012.10295
