Abstract

After mapping, RNA-Seq data can be summarized as a sequence of read counts. These counts are commonly modeled as Poisson variables with a constant rate along each transcript, a model that fits the data poorly. We suggest using variable rates for different positions and propose two models that predict these rates from the local sequence. These models explain more than 50% of the variation and can lead to improved estimates of gene and isoform expression for both Illumina and Applied Biosystems data.
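
The contrast between a constant-rate and a position-specific Poisson model can be illustrated with a minimal sketch. This is not the authors' fitted model: the function names, the log-linear form of the sequence preference, the `weights` table, and the toy sequence and counts are all assumptions chosen for illustration.

```python
import math

def poisson_loglik(counts, rates):
    """Log-likelihood of independent Poisson counts with the given rates."""
    return sum(n * math.log(r) - r - math.lgamma(n + 1)
               for n, r in zip(counts, rates))

def uniform_rates(counts):
    """Constant-rate model: every position shares the mean count."""
    mean = sum(counts) / len(counts)
    return [mean] * len(counts)

def sequence_rates(seq, counts, weights, k):
    """Variable-rate model (hypothetical log-linear form).

    The preference of position i depends on the k nucleotides starting
    at i; `weights` maps (offset, nucleotide) to a log-scale effect, and
    missing entries count as 0. The expression level is then the Poisson
    maximum-likelihood estimate given those preferences.
    """
    prefs = [
        math.exp(sum(weights.get((off, seq[i + off]), 0.0) for off in range(k)))
        for i in range(len(counts))
    ]
    expression = sum(counts) / sum(prefs)
    return [expression * p for p in prefs]

# Toy data (assumed): reads start three times as often at G as at A.
seq = "AGAGAGAG"
counts = [1, 3, 1, 3, 1, 3, 1, 3]
weights = {(0, "G"): math.log(3)}  # illustrative value, not a fitted one

uniform = uniform_rates(counts)
variable = sequence_rates(seq, counts, weights, k=1)

print("uniform  log-likelihood:", poisson_loglik(counts, uniform))
print("variable log-likelihood:", poisson_loglik(counts, variable))
```

On this toy input the sequence-dependent rates track the counts more closely than the single shared rate, which is the kind of improvement in fit the abstract describes. In practice the weights would be estimated from data rather than set by hand.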

Keywords

Poisson distribution, RNA-Seq, Count data, Computational biology, Sequence (biology), Variable (mathematics), Constant (computer programming), Computer science, Statistics, Gene, Biology, Mathematics, Genetics, Gene expression, Transcriptome

Publication Info

Year: 2010
Type: article
Volume: 11
Issue: 5
Pages: R50
Citations: 202
Access: Closed

Citation Metrics

202 (OpenAlex)

Cite This

Jun Li, Hui Jiang, Wing Hung Wong (2010). Modeling non-uniformity in short-read rates in RNA-Seq data. Genome Biology, 11(5), R50. https://doi.org/10.1186/gb-2010-11-5-r50

Identifiers

DOI
10.1186/gb-2010-11-5-r50