"Estimation of Alternative Splicing Isoform Frequencies from RNA-Seq Da" by Marius Nicolae, Serghei Mangul et al.

Computer Science Faculty Publications

Title

Estimation of Alternative Splicing Isoform Frequencies from RNA-Seq Data

Authors

Marius Nicolae
Serghei Mangul
Ion I. Măndoiu
Alexander Zelikovskiy, Georgia State University

Document Type

Article

Publication Date

2011

Abstract

Background: Massively parallel whole transcriptome sequencing, commonly referred to as RNA-Seq, is quickly becoming the technology of choice for gene expression profiling. However, due to the short read length delivered by current sequencing technologies, estimation of expression levels for alternative splicing gene isoforms remains challenging.

Results: In this paper we present a novel expectation-maximization algorithm for inference of isoform- and gene-specific expression levels from RNA-Seq data. Our algorithm, referred to as IsoEM, is based on disambiguating information provided by the distribution of insert sizes generated during sequencing library preparation, and takes advantage of base quality scores, strand and read pairing information when available. The open source Java implementation of IsoEM is freely available at http://dna.engr.uconn.edu/software/IsoEM/.
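
To illustrate the general idea of expectation-maximization for isoform frequencies, here is a minimal generic sketch in Python. It is not the IsoEM implementation (which is written in Java). The function name, the input layout, and the use of plain isoform length for normalization are assumptions made for illustration. Each read is assumed to arrive with a precomputed weight per compatible isoform, which in a fuller model could reflect insert size probability, base quality scores, and strand information.

```python
def estimate_isoform_frequencies(read_weights, isoform_lengths, iterations=200):
    """Generic EM sketch (hypothetical helper, not the IsoEM code).

    read_weights: list of dicts, one per read, mapping isoform name to a
        non-negative compatibility weight (assumed to be precomputed).
    isoform_lengths: dict mapping isoform name to its length in bases.
    Returns a dict of estimated isoform frequencies that sum to 1.
    """
    isoforms = list(isoform_lengths)
    # Start from uniform frequencies.
    freq = {j: 1.0 / len(isoforms) for j in isoforms}

    for _ in range(iterations):
        # E-step: split each read fractionally among its compatible isoforms,
        # in proportion to current frequency times compatibility weight.
        counts = {j: 0.0 for j in isoforms}
        for weights in read_weights:
            total = sum(freq[j] * w for j, w in weights.items())
            if total == 0.0:
                continue  # read is incompatible with every expressed isoform
            for j, w in weights.items():
                counts[j] += freq[j] * w / total

        # M-step: normalize expected read counts by isoform length
        # (simplifying assumption: plain length, no insert-size correction).
        norm = sum(counts[j] / isoform_lengths[j] for j in isoforms)
        if norm == 0.0:
            break
        freq = {j: counts[j] / isoform_lengths[j] / norm for j in isoforms}

    return freq


# Toy input: 30 reads unique to isoform A, 10 unique to B, 20 ambiguous.
reads = [{"A": 1.0}] * 30 + [{"B": 1.0}] * 10 + [{"A": 1.0, "B": 1.0}] * 20
lengths = {"A": 1000, "B": 1000}
result = estimate_isoform_frequencies(reads, lengths)
```

In this toy input the two isoforms have equal length, so the ambiguous reads are apportioned according to the evidence from the uniquely mapped reads, and the estimate for isoform A settles near 0.75.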

Conclusions: Empirical experiments on both synthetic and real RNA-Seq datasets show that IsoEM has scalable running time and outperforms existing methods of isoform and gene expression level estimation. Simulation experiments confirm previous findings that, for a fixed sequencing cost, using reads longer than 25-36 bases does not necessarily lead to better accuracy for estimating expression levels of annotated isoforms and genes.

Comments

This article was originally published in Algorithms for Molecular Biology.

© 2011 Nicolae et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Recommended Citation

Nicolae et al.: Estimation of alternative splicing isoform frequencies from RNA-Seq data. Algorithms for Molecular Biology 2011 6:9. doi: 10.1186/1748-7188-6-9

Creative Commons License

This work is licensed under a Creative Commons Attribution 3.0 License.

Included in

Computer Sciences Commons, Molecular Biology Commons
