Abstract
We present Rsubread, a Bioconductor software package that provides high-performance alignment and read counting functions for RNA-seq reads. Rsubread is based on the successful Subread suite with the added ease-of-use of the R programming environment, creating a matrix of read counts directly as an R object ready for downstream analysis. It integrates read mapping and quantification in a single package and has no software dependencies other than R itself. We demonstrate Rsubread's ability to detect exon-exon junctions de novo and to quantify expression at the level of either genes, exons or exon junctions. The resulting read counts can be input directly into a wide range of downstream statistical analyses using other Bioconductor packages. Using SEQC data and simulations, we compare Rsubread to TopHat2, STAR and HTSeq as well as to counting functions in the Bioconductor infrastructure packages. We consider the performance of these tools on the combined quantification task starting from raw sequence reads through to summary counts, and in particular evaluate the performance of different combinations of alignment and counting algorithms. We show that Rsubread is faster and uses less memory than competitor tools and produces read count summaries that more accurately correlate with true values.
Keywords
MeSH Terms
Affiliated Institutions
Related Publications
Fast and accurate short read alignment with Burrows–Wheeler transform
Abstract Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A...
Minimap2: pairwise alignment for nucleotide sequences
Abstract Motivation Recent advances in sequencing technologies promise ultra-long reads of ∼100 kb in average, full-length mRNA or cDNA reads in high throughput and genomic cont...
ART: a next-generation sequencing read simulator
Abstract Summary: ART is a set of simulation tools that generate synthetic next-generation sequencing reads. This functionality is essential for testing and benchmarking tools f...
Fast and accurate long-read alignment with Burrows–Wheeler transform
Abstract Motivation: Many programs for aligning short sequencing reads to a reference genome have been developed in the last 2 years. Most of them are very efficient for short r...
Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R
Abstract Motivation Single-cell RNA sequencing (scRNA-seq) is increasingly used to study gene expression at the level of individual cells. However, preparing raw sequence data f...
Publication Info
- Year
- 2019
- Type
- article
- Volume
- 47
- Issue
- 8
- Pages
- e47-e47
- Citations
- 2941
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1093/nar/gkz114
- PMID
- 30783653
- PMCID
- PMC6486549