Abstract

Deep sequencing of RNAs (RNA-seq) has been a useful tool for characterizing and quantifying transcriptomes. However, there are significant challenges in the analysis of RNA-seq data, such as how to separate signals from sequencing bias and how to perform reasonable normalization. Here, we focus on a fundamental question in RNA-seq analysis: the distribution of the position-level read counts. Specifically, we propose a two-parameter generalized Poisson (GP) model for the position-level read counts. We show that the GP model fits the data much better than the traditional Poisson model. Based on the GP model, we can better estimate gene or exon expression, perform a more reasonable normalization across different samples, and improve the identification of both differentially expressed genes and differentially spliced exons. The usefulness of the GP model is demonstrated by applications to multiple RNA-seq data sets.
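
The abstract does not state the form of the GP model. As a minimal sketch, assume it is Consul's generalized Poisson distribution with parameters theta > 0 and 0 <= lambda < 1, whose mean is theta / (1 - lambda) and whose variance is theta / (1 - lambda)^3, so that lambda = 0 recovers the ordinary Poisson. The function names and the moment-based fit below are illustrative assumptions, not the estimation procedure used in the paper.

```python
import math


def gp_log_pmf(x, theta, lam):
    """Log-probability of count x under the assumed GP(theta, lam) form.

    P(X = x) = theta * (theta + x*lam)**(x - 1) * exp(-theta - x*lam) / x!
    Assumes theta > 0 and 0 <= lam < 1; lam = 0 gives the Poisson pmf.
    """
    return (math.log(theta)
            + (x - 1) * math.log(theta + x * lam)
            - theta - x * lam
            - math.lgamma(x + 1))


def gp_moment_fit(counts):
    """Method-of-moments estimate of (theta, lam) from position-level counts.

    Uses mean = theta / (1 - lam) and var = theta / (1 - lam)**3, hence
    var / mean = 1 / (1 - lam)**2. If the sample is not overdispersed,
    lam is set to 0 (the Poisson special case).
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    lam = 1.0 - math.sqrt(mean / var) if var > mean else 0.0
    theta = mean * (1.0 - lam)
    return theta, lam


# Hypothetical per-position read counts for one exon (made-up numbers).
counts = [0, 1, 1, 2, 8, 0, 3, 12, 1, 2]
theta, lam = gp_moment_fit(counts)
log_lik = sum(gp_log_pmf(c, theta, lam) for c in counts)
```

A positive fitted lambda indicates that the position-level counts vary more than a Poisson model with the same mean would allow, which is the situation the second parameter is meant to absorb.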

Keywords

Biology, Normalization (sociology), RNA-Seq, Poisson distribution, Computational biology, RNA, Deep sequencing, Transcriptome, Exon, Identification (biology), Gene, Genetics, Gene expression, Statistics, Mathematics, Genome

Publication Info

Year: 2010
Type: article
Volume: 38
Issue: 17
Pages: e170-e170
Citations: 151
Access: Closed

Cite This

Sudeep Srivastava, Liang Chen (2010). A two-parameter generalized Poisson model to improve the analysis of RNA-seq data. Nucleic Acids Research, 38(17), e170-e170. https://doi.org/10.1093/nar/gkq670

Identifiers

DOI: 10.1093/nar/gkq670