Abstract

Motivation: High-throughput sequencing has made the analysis of new model organisms more affordable. Although assembling a new genome can still be costly and difficult, it is possible to use RNA-seq to sequence mRNA. In the absence of a known genome, it is necessary to assemble these sequences de novo, taking into account possible alternative isoforms and the dynamic range of expression values.

Results: We present a software package named Oases designed to heuristically assemble RNA-seq reads in the absence of a reference genome, across a broad spectrum of expression values and in the presence of alternative isoforms. It achieves this by using an array of hash lengths, dynamic filtering of noise, robust resolution of alternative splicing events and efficient merging of multiple assemblies. It was tested on human and mouse RNA-seq data and is shown to improve significantly on the transABySS and Trinity de novo transcriptome assemblers.

Availability and implementation: Oases is freely available under the GPL license at www.ebi.ac.uk/~zerbino/oases/

Contact: dzerbino@ucsc.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
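The phrase "an array of hash lengths" can be made concrete with a toy sketch. It assumes that a hash length is the k-mer length k used to build a de Bruijn graph, and it only shows how the same reads decompose differently at several values of k. The function names `kmer_counts` and `de_bruijn_edges` and the example reads are hypothetical. This is not the Oases implementation, and it omits noise filtering, splicing resolution and assembly merging.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        # A read shorter than k contributes nothing (empty range).
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def de_bruijn_edges(counts, min_count=1):
    """Return (prefix, suffix) edges for k-mers seen at least min_count times.

    min_count is a fixed illustrative threshold, not the dynamic
    noise filter described in the abstract.
    """
    return {
        (kmer[:-1], kmer[1:])
        for kmer, count in counts.items()
        if count >= min_count
    }


if __name__ == "__main__":
    # Hypothetical overlapping reads from a single transcript.
    reads = ["ACGTACGGA", "CGTACGGAT", "GTACGGATC"]
    for k in (3, 5, 7):  # an "array" of hash lengths
        counts = kmer_counts(reads, k)
        edges = de_bruijn_edges(counts)
        print(f"k={k}: {len(counts)} distinct k-mers, {len(edges)} edges")
```

In this toy setting, smaller values of k leave more k-mers per read, so sparsely covered sequences stay connected, while larger values of k are less likely to merge repeated substrings. That trade-off is one plausible reason for combining assemblies built at several hash lengths.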

Keywords

Computational biology, RNA-Seq, De novo transcriptome assembly, Sequence assembly, Transcriptome, Biology, Software, Reference genome, Alternative splicing, Genome, RNA, RNA splicing, Hash function, Computer science, Genetics, Gene isoform, Gene, Gene expression

Publication Info

Year: 2012
Type: article
Volume: 28
Issue: 8
Pages: 1086-1092
Citations: 1452
Access: Closed

Citation Metrics

1452
OpenAlex

Cite This

Marcel H. Schulz, Daniel R. Zerbino, Martin Vingron et al. (2012). Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics, 28(8), 1086-1092. https://doi.org/10.1093/bioinformatics/bts094

Identifiers

DOI: 10.1093/bioinformatics/bts094