Effect of RNA integrity on uniquely mapped reads in RNA-Seq

Chen, Emily A; Souaiaia, Tade; Herstein, Jennifer S; Evgrafov, Oleg V; Spitsyna, Valeria N; Rebolini, Danea F; Knowles, James A

doi:10.1186/1756-0500-7-753

Correspondence
Open access
Published: 23 October 2014

Effect of RNA integrity on uniquely mapped reads in RNA-Seq

Emily A Chen¹,
Tade Souaiaia²,
Jennifer S Herstein²,
Oleg V Evgrafov²,
Valeria N Spitsyna²,
Danea F Rebolini³ &
…
James A Knowles²

BMC Research Notes volume 7, Article number: 753 (2014) Cite this article

3210 Accesses
15 Citations
17 Altmetric
Metrics details

Abstract

Background

We examined the performance of three RNA-Sequencing library preparation protocols as a function of RNA integrity, comparing gene expressions between heat-degraded samples to their high-quality counterparts. This work is invaluable given the difficulty of obtaining high-quality RNA from tissues, particularly those from individuals with disease phenotypes.

Results

With the integrity of total RNA being a critical parameter for RNA-Sequencing analysis, degraded RNA can heavily influence the results of gene expression profiles. We discovered that gene expression read results are influenced by RNA quality when a common library construction protocol is used. These results are based on one technical experiment from a pool of 4 neural progenitor cell lines.

Conclusions

The use of alternative protocols can allow samples with a wider range of RNA qualities to be used, facilitating the investigation of disease tissues.

Discussion

Background

Adiconis et al.[1] examined the performance of five RNA-Seq sample preparation protocols when using RNA of low quality and/or quantity. This work is invaluable given the difficulty of obtaining high-quality RNA from tissues, particularly those from individuals with disease phenotypes. We have used a similar approach of evaluating the performance of RNA-Seq library preparation protocols, as a function of RNA integrity. We compared gene expression, as measured by RNA-Seq, of heat-degraded RNA samples to the expression profiles of the high-quality starting samples.

Methods and results

Specifically, 20 ug of high-quality total RNA (RIN 9.4; 2100 Bioanalyzer, Agilent Technologies Inc., Santa Clara, CA, USA) was constructed by pooling RNA extracted using a Direct-zol RNA MiniPrep kit (Zymo Research, Irvine, CA, USA) from neural progenitor cell lines made from 4 individuals [2]. This pool was heat-degraded (60 minutes at 60°C, followed by 6, 20 and 30 mins at 90°C) to RINs of 7.4, 5.3, and 4.5 [3]. RNA-Seq libraries were then made using three different protocols. 1) Poly-A RNA was purified from 1 ug of total RNA using oligo-dT beads, fragmented with divalent cations, made into cDNA and then sequencing libraries using the TruSeq RNA Sample Preparation kit v2 (RS-122-2001, Illumina Inc., San Diego, CA, USA). 2) Ribosomal RNA was removed from 1 ug of total RNA using the Ribo-Zero rRNA Removal kit (MRZH116, Epicentre Biotechnologies, Madison, WI, USA), and processed without the poly-A selection as per #1. 3) cDNA was made from 200 ng of total RNA using the Ovation RNA-Seq FFPE System (7150, NuGEN Technologies Inc., San Carlos, CA, USA), sheared to 300 bp using a Covaris S2 (500003, Covaris Inc., Woburn, MA, USA), and followed by library construction using the TruSeq DNA Sample Preparation kit v2 (FC-121-2001).

Each DNA library was sequenced with 4.5–60 million 100 bp single-end reads on an Illumina HiSeq2000. The reads were uniquely mapped with three or fewer mismatches using PerM [4] to GENCODE v17. For each protocol, we calculated Pearson’s pairwise correlation coefficients (denoted by the letter R) between the degraded and high-quality sample across the HUGO genes which contained at least one read alignment in either sample. R was calculated and depicted in Figure 1 by taking the log of (reads plus an offset of 1). All three protocols performed well at RIN 7.4 (R = 0.958 to 0.984, s.e. = 0.001 to 0.002) (Figure 1, Table 1). However, as RNA quality decreased (RINs 5.3 and 4.5), protocol #1 produced data with lower correlations of gene expression to the intact sample (R = 0.533 and 0.366, s.e. =0.005). In contrast, both protocols #2 and #3 performed relatively well as RNA quality decreased (R = 0.951 to 0.967, s.e. = 0.002), with protocol #3 performing slightly better. For each RIN quality, we calculated R between the reads from each pair of protocols. The reads from the two best methods (Protocol #2 and Protocol #3) maintained high correlations regardless of decreased sample quality (R = 0.845 to 0.879, s.e. = 0.003). For Protocol #1, there was a drop in read correlation to both Protocol #2 and Protocol #3 as RIN decreased (Figure 2).

Table 1 R between degraded sample and intact sample for each protocol

Full size table

For confirmation of mapper accuracy, we mapped all of the samples using TopHat v1.4.0 [5] to GENCODE v17. The resulting BAM files were run through HTSeq v0.6.1 [6] to obtain uniquely mapped read counts. Essentially the same results were obtained as with PerM (data not shown). Additionally, to rule out any bias from differences in numbers of reads, we downsampled all of the samples to 4.5 million reads, and the results were essentially the same (data not shown).

Conclusions

It is likely that the poor performance of protocol #1 at lower RINs can be explained by the poly-A selection step. As RNA integrity decreases, less full length poly-A RNA is recovered, leading to a cDNA library that is increasingly 3′ biased. This is supported by analysis of the 5′ to 3′ read distribution of each library. Those from protocols #2 and #3 are essentially unchanged at decreasing RIN, while the distribution for samples from protocol #1 is severely 3′ biased by RIN 4.5 (data not shown).

We recognize that our results are based on a single experiment using an RNA pool from 4 neural progenitor cell lines and are not broadly applicable. Hence, other investigators may want to use this method to determine the effect of RNA integrity on RNA-Seq from their tissue source of interest.

In summary, our data show that the results of RNA-Seq are influenced by RNA quality with a widely-used cDNA/sequencing library construction protocol. However, this problem can be avoided with alternative protocols, allowing samples with a wider range of RNA qualities to be used, facilitating the investigation of disease tissues.

References

Adiconis X, Borges-Rivera D, Satija R, DeLuca DS, Busby MA, Berlin AM, Sivachenko A, Thompson DA, Wysoker A, Fennell T, Gnirke A, Pochet N, Regev A, Levin JZ: Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat Methods. 2013, 10: 623-629. 10.1038/nmeth.2483.
Article PubMed CAS Google Scholar
Evgrafov OV, Wrobel BB, Kang X, Simpson G, Malaspina D, Knowles JA: Olfactory neuroepithelium-derived neural progenitor cells as a model system for investigating the molecular mechanisms of neuropsychiatric disorders. Psychiatr Genet. 2011, 21: 217-228. 10.1097/YPG.0b013e328341a2f0.
Article PubMed Google Scholar
Opitz L, Salinas-Riester G, Grade M, Jung K, Jo P, Emons G, Ghadimi BM, Beissbarth T, Gaedcke J: Impact of RNA degradation on gene expression profiling. BMC Med Genomics. 2010, 3: 36-10.1186/1755-8794-3-36.
Article PubMed PubMed Central Google Scholar
Chen Y, Souaiaia T, Chen T: PerM: efficient mapping of short sequencing reads with periodic full sensitive spaced seeds. Bioinformatics. 2009, 25: 2514-2521. 10.1093/bioinformatics/btp486.
Article PubMed CAS PubMed Central Google Scholar
Trapnell C, Pachter L, Salzberg S: TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009, 25: 1105-1111. 10.1093/bioinformatics/btp120.
Article PubMed CAS PubMed Central Google Scholar
Anders S, Pyl PT, Huber W: HTSeq - A Python framework to work with high-throughput sequencing data. BioRxiv preprint. 2014,http://www.ncbi.nlm.nih.gov/pubmed/25260700,
Google Scholar

Download references

Acknowledgements

This work was supported in part by research grants from the NIMH (MH090047 and MH086874), and the NHGRI (HG006531).

Author information

Authors and Affiliations

Department of Genetic, Molecular, and Cellular Biology, Keck School of Medicine, Zilkha Neurogenetic Institute, University of Southern California, 1501 San Pablo Street, ZNI 401, MC 2821, Los Angeles, CA, 90089-2821, USA
Emily A Chen
Department of Psychiatry and the Behavioral Sciences, Keck School of Medicine, Zilkha Neurogenetic Institute, University of Southern California, 1501 San Pablo Street, ZNI 401, MC 2821, Los Angeles, CA, 90089-2821, USA
Tade Souaiaia, Jennifer S Herstein, Oleg V Evgrafov, Valeria N Spitsyna & James A Knowles
Illumina, Inc., 5200 Illumina Way, San Diego, CA, 92122, USA
Danea F Rebolini

Authors

Emily A Chen
View author publications
You can also search for this author in PubMed Google Scholar
Tade Souaiaia
View author publications
You can also search for this author in PubMed Google Scholar
Jennifer S Herstein
View author publications
You can also search for this author in PubMed Google Scholar
Oleg V Evgrafov
View author publications
You can also search for this author in PubMed Google Scholar
Valeria N Spitsyna
View author publications
You can also search for this author in PubMed Google Scholar
Danea F Rebolini
View author publications
You can also search for this author in PubMed Google Scholar
James A Knowles
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Emily A Chen.

Additional information

Competing interests

DR is currently employed at Illumina, Inc. JK received an honorarium for speaking at an Illumina, Inc. event.

Authors’ contributions

EC carried out the sample preparation, data analysis and co-authored the manuscript. TS developed the mapping software. JH performed sequence alignment. OE advised on data analysis. VS extracted the RNA from the cell line. DR carried out the RNA sequencing. JK conceived the experiments, advised on data analysis and co-authored the manuscript. All authors read and approved the final manuscript.

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Cite this article

Chen, E.A., Souaiaia, T., Herstein, J.S. et al. Effect of RNA integrity on uniquely mapped reads in RNA-Seq. BMC Res Notes 7, 753 (2014). https://doi.org/10.1186/1756-0500-7-753

Download citation

Received: 20 March 2014
Accepted: 14 October 2014
Published: 23 October 2014
DOI: https://doi.org/10.1186/1756-0500-7-753

Effect of RNA integrity on uniquely mapped reads in RNA-Seq

Abstract

Background

Results

Conclusions

Discussion

Background

Methods and results

Conclusions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Competing interests

Authors’ contributions

Authors’ original submitted files for images

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Rights and permissions

About this article

Cite this article

Keywords

BMC Research Notes

Contact us

Effect of RNA integrity on uniquely mapped reads in RNA-Seq

Abstract

Background

Results

Conclusions

Discussion

Background

Methods and results

Conclusions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Competing interests

Authors’ contributions

Authors’ original submitted files for images

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Rights and permissions

About this article

Cite this article

Share this article

Keywords

BMC Research Notes

Contact us