- Research Note
- Open access
Evaluation of rRNA depletion methods for capturing the RNA virome from environmental surfaces
BMC Research Notes volume 16, Article number: 142 (2023)
Abstract
Objective
Metatranscriptomic analysis of RNA viromes on built-environment surfaces is hampered by low RNA yields and high abundance of rRNA. Therefore, we evaluated library quality, rRNA depletion efficiency, and viral detection sensitivity using mock community RNA and RNA from a melamine-coated table surface, at input amounts below the 5 ng recommended for the library preparation kit (NEBNext Ultra II Directional RNA Library Prep Kit).
Results
Good-quality RNA libraries were obtained from 0.1 ng of mock community and table surface RNA by changing the adapter concentration and number of PCR cycles. Differences in the target species of the rRNA depletion method affected the community composition and sensitivity of virus detection. In two replicates depleted of both human and bacterial rRNA, viral occupancy was 0.259% and 0.290%, a 3.4- and 3.8-fold increase over samples depleted of bacterial rRNA only. Among samples spiked with SARS-CoV-2 RNA, more SARS-CoV-2 reads were detected after bacterial rRNA depletion than after human rRNA depletion. We demonstrated that metatranscriptome analysis of RNA viromes is possible from RNA isolated from an indoor surface (representing a built-environment surface) using a standard library preparation kit.
Introduction
A growing interest in the presence of microbes and viruses on built-environment surfaces [1] has encouraged the international consortium MetaSUB [2] (established in 2015) to investigate the urban microbiome on a global scale. Approximately 5,000 samples obtained from built-environment surfaces (such as subway stations) in 60 cities worldwide were subjected to metagenomic analyses [3]. Results revealed the presence of ~ 11,000 DNA viruses and indicated that the microbiome of built-environment surfaces is distinctly different from that of other environments, such as the human body and soil. The METACoV project, derived from this project, aimed to characterize the changes in the urban microbiome and RNA virome, which includes the SARS-CoV-2 virus, during the COVID-19 pandemic via shotgun metatranscriptomics (total RNA-seq) [4].
However, RNA virome analysis of built-environment surfaces is associated with several challenges. First, owing to the extremely low biomass, the amount of RNA obtained (in the range of picograms or less) is markedly lower than the nanogram-order input that standard library preparation kits recommend [5, 6, 7]. Nevertheless, several researchers have successfully prepared high-quality libraries from lower amounts of input RNA (250–500 pg) than those recommended by the manufacturers [8, 9]. Second, since most shotgun DNA sequences on built-environment surfaces are of bacterial origin, followed by eukaryotes such as fungi and humans [5, 6], abundant rRNA might reduce the sensitivity of RNA virome analysis. Several studies that detected SARS-CoV-2 in human clinical specimens via metatranscriptomics reported that human or bacterial rRNA depletion can improve viral genome detection [10, 11, 12, 13, 14]. Owing to these issues and the sequencing costs of RNA analysis, to our knowledge, no reports of metatranscriptomic RNA virome analysis from built-environment surfaces are currently available.
Therefore, in this pilot study, we evaluated the impact of rRNA depletion methods on the quality of sequencing libraries, community composition, and sensitivity of virus detection using mock community RNA (0.1–10 ng) and RNA from a melamine-coated table surface (< 1 ng), an amount below the recommended input (5 ng) for the library preparation kit used here. We also investigated the detection limit of SARS-CoV-2 using serial dilutions of synthetic viral RNA spiked into RNA samples from the table surface.
Results and discussion
Effects of mock community RNA input amount on library preparation
We used a combination of NEBNext rRNA Depletion Kit for rRNA depletion and NEBNext Ultra II Directional RNA Library Prep Kit for metatranscriptomic library preparation. The minimum recommended combined amount of total RNA for these kits is 5 ng. To test the feasibility of library preparation from RNA amounts below the protocol requirements, total RNA extracted from ZymoBIOMICS Microbial Community Standard cells was serially diluted to 10, 1.0, and 0.1 ng, and rRNA depletion and library preparation were performed. Since this mock community primarily consisted of bacteria, rRNA depletion kits for human RNA were used to preserve the abundant bacterial rRNA.
All three samples (10, 1, and 0.1 ng) were successfully prepared for library construction and sequenced. Adapter concentration and number of PCR cycles were optimally adjusted, resulting in high-quality reads (Additional file 2: Table S1). Read duplication rates increased with decreasing amounts of input RNA. Higher duplicate rates indicate lower RNA complexity in the samples, which can reflect the presence of abundant bacterial rRNA that was not depleted (Additional file 2: Table S1) and the insufficient amount of input RNA, especially in the 0.1 ng sample. We subsequently assessed the bias of reduced RNA input on the species relative abundance estimate, and no clear differences in taxonomic compositions were observed among the three samples (Additional file 2: Table S2). Since information on the RNA composition ratios of each species in the mock community was not provided by the manufacturer, comparisons with theoretical content were not possible. However, when results for 10 ng RNA samples were compared with those for the other samples, no marked differences were observed, except for a slight difference in the percentage of Bacillus subtilis. This indicates that the combined use of the rRNA depletion kit and library preparation kit can produce high-quality libraries, even when the RNA input is reduced to 0.1 ng (100 pg). Our results are consistent with the 250–500 pg of RNA input reported previously [8, 9].
Library preparation from table surface samples, rRNA depletion efficiency, and community composition
To evaluate the feasibility of library preparation from built-environment surface RNA samples with less than the recommended RNA amount for the library preparation kit, we used RNA samples from a melamine-coated table used daily by students. Simultaneously, we evaluated the impact of human or bacterial rRNA depletion on community composition and virus detection sensitivity. We spiked serial dilutions of synthetic SARS-CoV-2 RNA (10¹–10⁵ copies) into RNA samples from the table, followed by rRNA depletion of either human or bacterial rRNA, and then performed library preparation. Abundant bacteria were assumed to be present on the table surface [5, 6]; therefore, bacterial rRNA depletion was performed for five samples representing the full range of spike-in levels evaluated. An additional human rRNA-depleted sample spiked with 10⁵ copies of SARS-CoV-2 was also prepared.
Although the RNA concentrations of the table surface samples were below the limit of detection of Qubit RNA HS Assay Kit (0.25 ng/µL), all samples (except the negative control) allowed generation of sufficient libraries for sequencing (Table 1). Thus, the library concentration can be used as a proxy for the RNA concentration initially extracted from each sample, and since the library concentration obtained from the table sample is approximately three times higher than that obtained from 0.1 ng of mock RNA (Additional file 2: Table S1 and Table 1), the amount of RNA in the table sample was estimated to be ~ 0.3 ng.
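The estimate above is a simple proportional scaling, which can be sketched as follows. This is only an illustration: it assumes that library concentration scales linearly with RNA input under identical adapter and PCR conditions, and the function name `estimate_rna_input_ng` is hypothetical rather than part of the study's workflow.

```python
def estimate_rna_input_ng(reference_input_ng: float, library_conc_ratio: float) -> float:
    """Scale a known RNA input by a ratio of library concentrations.

    reference_input_ng: RNA input of the reference library (0.1 ng of mock RNA).
    library_conc_ratio: sample library concentration divided by the reference
        library concentration (approximately 3 for the table sample).
    Assumption: yield is proportional to input, which may not hold if the
    number of PCR cycles or the adapter concentration differs between libraries.
    """
    if reference_input_ng <= 0 or library_conc_ratio <= 0:
        raise ValueError("inputs must be positive")
    return reference_input_ng * library_conc_ratio


estimate = estimate_rna_input_ng(reference_input_ng=0.1, library_conc_ratio=3.0)
print(f"Estimated table-surface RNA input: ~{estimate:.1f} ng")
```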
All samples, including negative controls, were successfully sequenced (Table 1). The average percentage of high-quality reads that passed quality filtering was 97.1%, with read duplication rates of 55.4–61.9%. The rRNA read percentages showed clear differences between the two depletion methods. The percentage of eukaryotic rRNA reads in human and bacterial rRNA-depleted samples averaged 4.4% and 88.5%, respectively. Conversely, the percentage of bacterial rRNA reads in human and bacterial rRNA-depleted samples averaged 86.3% and 0.9%, respectively.
We evaluated the effect of different rRNA depletion methods on community composition (Table 2). The estimated relative abundance of eukaryotes in human and bacterial rRNA-depleted samples averaged 19.9% and 89.1%, respectively, while that of bacteria averaged 79.8% and 10.6%, respectively; in each case, the group whose rRNA was depleted made up the smaller share of the community, as expected for the depletion method used. In the non-spiked samples (TH vs. TB), the estimated viral occupancy of the bacterial rRNA-depleted sample was slightly higher than that of the human rRNA-depleted sample (Table 2, TH: 0.061% vs. TB: 0.076%). Even after rRNA depletion using either method, the percentage of non-rRNA reads was only ~ 10% (Table 1).
We then focused on the community composition of the table surface (Fig. 2 and Additional file 2: Table S3). In human rRNA-depleted samples, the commensal skin bacterium Cutibacterium acnes was the most abundant. This is likely because table surfaces frequently come in contact with the human skin. In contrast, in bacterial rRNA-depleted samples, mites and nematodes (Tyrophagus putrescentiae, Poikilolaimus oxycercus) and seaweed (Ulva expansa/prolifera) were the most abundant. Since seaweed is commonly used in Japanese foods, it may have originated from meal contaminants present on table surfaces.
This demonstrates that metatranscriptome analysis may be performed on RNA from table surfaces using a standard library preparation kit. Additionally, differences in target species for the rRNA depletion method affected community composition and viral detection sensitivity.
Effects of rRNA depletion on sensitivity of SARS-CoV-2 RNA spiked into RNA samples
To assess the detection limit of SARS-CoV-2 RNA, we evaluated the number of SARS-CoV-2 reads obtained from the spiked human and bacterial rRNA-depleted samples using a k-mer-based (Kraken2–Bracken) and an alignment-based (BWA-MEM) approach. Kraken2–Bracken results indicated that no reads were detected in the negative control and non-spiked samples; however, SARS-CoV-2 reads were detected at > 10³ copies in bacterial rRNA-depleted samples, with the number of reads increasing almost 10-fold with each 10-fold increase in copy number (Table 2). Between the samples spiked with 10⁵ copies of SARS-CoV-2 (TH-s100k vs. TB-s100k), the bacterial rRNA-depleted sample (TB-s100k) yielded ~ 5-fold more SARS-CoV-2 reads than the human rRNA-depleted sample (TH-s100k; Bracken reads: 65,368 vs. 12,693).
While Kraken2–Bracken did not detect any reads classified as SARS-CoV-2 in negative controls and non-spiked samples, BWA-MEM mapped 81–1,437 reads to the SARS-CoV-2 reference genome in those samples. The majority of these reads mapped to a low-complexity, polyA-containing region in the 3′ UTR (Additional file 2: Table S4), suggesting that they were false positives. Similarly, the 72 (TB-s0.01k) and 86 (TB-s0.1k) reads detected in samples spiked with fewer than 10³ copies were attributed to false positives. Between the samples spiked with 10⁵ copies of SARS-CoV-2 (TH-s100k vs. TB-s100k), the number of SARS-CoV-2 reads was slightly higher in the bacterial rRNA-depleted sample, although the difference was smaller than that obtained using Kraken2–Bracken. Among bacterial rRNA-depleted samples, SARS-CoV-2 genome coverage was 43.3% and 98.0% for the samples spiked with 10³ (TB-s1k) and 10⁴ (TB-s10k) copies, respectively (Table 2 and Additional file 3: Figure S1). The average depth of genome coverage was 2.7 × and 28.2 × for the 10³ (TB-s1k) and 10⁴ (TB-s10k) samples, respectively.
In summary, at least 10⁴ copies are required to obtain the average genome coverage depth (~ 30 ×) and genome coverage (> 98%) needed to detect mutations in the SARS-CoV-2 genome in this experimental system, and the detection limit for SARS-CoV-2 RNA is 10³ copies. This value corresponds to that reported in a previous study on SARS-CoV-2 surface swabbing using qRT-PCR [15]. Although our study used synthetic RNA and theirs used inactivated viral particles, they reported that a minimum of 1,000 viable viral particles per 25 cm² surface is required to ensure successful virus recovery and detection.
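The two coverage metrics discussed in this section, genome coverage (breadth) and average depth, can be computed from per-base read depths. The sketch below is illustrative only: the text does not state which tool produced these values, so the input is assumed to be three-column, tab-separated lines (reference, position, depth), such as those written by `samtools depth -a`, and the function names and toy data are hypothetical.

```python
from typing import Iterable, List, Tuple


def parse_depth_lines(lines: Iterable[str]) -> List[int]:
    """Extract the depth column from 'reference<TAB>position<TAB>depth' lines."""
    return [int(line.rstrip("\n").split("\t")[2]) for line in lines if line.strip()]


def coverage_summary(depths: List[int], min_depth: int = 1) -> Tuple[float, float]:
    """Return (breadth in %, mean depth) over all reference positions.

    Breadth is the percentage of positions covered by at least `min_depth`
    reads; mean depth is averaged over every position, including uncovered ones.
    """
    if not depths:
        raise ValueError("no positions supplied")
    covered = sum(1 for d in depths if d >= min_depth)
    breadth = 100.0 * covered / len(depths)
    mean_depth = sum(depths) / len(depths)
    return breadth, mean_depth


# Toy four-base reference (not real data): two positions are uncovered.
toy_lines = ["ref\t1\t0", "ref\t2\t3", "ref\t3\t5", "ref\t4\t0"]
breadth, mean_depth = coverage_summary(parse_depth_lines(toy_lines))
print(f"breadth = {breadth:.1f}%, mean depth = {mean_depth:.1f}x")
```

Positions with zero depth must be present in the input for the breadth to be meaningful; otherwise the denominator would count only covered bases.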
Efficiency of simultaneous depletion of both human and bacterial rRNA
Depletion of both human- and bacteria-derived rRNA may further increase the number of virus-derived reads; therefore, we employed simultaneous depletion of both human and bacterial rRNA (see Materials and Methods). The percentage of non-rRNA reads was 61.6% (THB1) and 62.6% (THB2) in the simultaneously depleted samples (Additional file 2: Table S5), markedly higher than that in samples depleted of either human (9.9%) or bacterial (9.6%) rRNA (Table 1). The percentage of viral occupancy was 0.259% (THB1) and 0.290% (THB2) in the simultaneously depleted samples (Additional file 2: Table S5), a 3.4-fold (THB1) and 3.8-fold (THB2) increase compared to that in the bacterial rRNA-depleted samples (0.076%; Table 2).
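As a quick arithmetic check, the fold increases quoted above follow directly from the reported viral occupancy percentages. The snippet below is a minimal sketch using only values stated in the text; the helper name `fold_change` is illustrative.

```python
def fold_change(value: float, baseline: float) -> float:
    """Ratio of a value to its baseline (baseline must be positive)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return value / baseline


# Viral occupancy (%) reported in the text.
bacterial_depleted_tb = 0.076  # bacterial rRNA-depleted sample, Table 2
simultaneously_depleted = {"THB1": 0.259, "THB2": 0.290}  # Table S5

for sample, occupancy in simultaneously_depleted.items():
    ratio = fold_change(occupancy, bacterial_depleted_tb)
    print(f"{sample}: {ratio:.1f}-fold vs. bacterial rRNA depletion only")
```

The same helper reproduces the roughly 5-fold difference in Bracken SARS-CoV-2 read counts between TB-s100k and TH-s100k (65,368 vs. 12,693).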
This demonstrates that simultaneous depletion of both human- and bacteria-derived rRNAs further increased the number of virus-derived reads. However, human RNA was not the only major source of eukaryotic RNA on the table surface (Additional file 3: Figure S2), suggesting that depletion of rRNAs of non-human eukaryotic origin is important for further improving virus detection sensitivity.
Conclusions
Here, we demonstrated that metatranscriptome analysis can be performed on RNA from table surfaces, representative of built-environment surfaces, using a standard library preparation kit by changing the adapter concentration and number of PCR cycles. The rRNA depletion method performed also influenced community composition and virus detection sensitivity.
Limitations
The built-environment surface targeted here was limited to one table, and the effects of rRNA depletion on detecting a small number of virus-derived reads might be considerably different with other surfaces. Owing to the limited amount of table surface-derived RNA that could be prepared, verifying reproducibility under each condition was difficult. Additionally, because the viral synthetic RNA used was spiked into the sample RNA after extraction, the detection sensitivity of the viral synthetic RNA may have been biased toward high sensitivity, compared to more realistic methods that absorb synthetic RNA into the swab during sampling.
Availability of data and materials
The sequenced raw reads used for this study were deposited in the DNA Data Bank of Japan Sequence Read Archive (DRA), the National Center for Biotechnology Information Sequence Read Archive (SRA), and the European Bioinformatics Institute Sequence Read Archive (ERA) under the accession numbers DRA014951 and DRA015134.
Abbreviations
- DRS: DNA/RNA shield
- NTC: No template control
References
Gilbert JA, Stephens B. Microbiology of the built environment. Nat Rev Microbiol. 2018;16:661–70. https://doi.org/10.1038/s41579-018-0065-5.
The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium. http://metasub.org. Accessed 5 Aug 2022.
Danko D, Bezdan D, Afshin EE, Ahsanuddin S, Bhattacharya C, Butler DJ, et al. A global metagenomic map of urban microbiomes and antimicrobial resistance. Cell. 2021;184:3376-93.e17. https://doi.org/10.1016/j.cell.2021.05.002.
MetaCoV: RNA/COVID-19 ENVIRONMENTAL SAMPLING. http://metasub.org/projects/. Accessed 5 Aug 2022.
Hsu T, Joice R, Vallarino J, Abu-Ali G, Hartmann EM, Shafquat A, et al. Urban transit system microbial communities differ by surface type and interaction with humans and the environment. mSystems. 2016;1:e00018-e116. https://doi.org/10.1128/mSystems.00018-16.
Afshinnekoo E, Meydan C, Chowdhury S, Jaroudi D, Boyer C, Bernstein N, et al. Geospatial resolution of human and bacterial diversity with city-scale metagenomics. Cell Syst. 2015;1:72–87. https://doi.org/10.1016/j.cels.2015.01.001.
Prussin AJ, Belser JA, Bischoff W, Kelley ST, Lin K, Lindsley WG, et al. Viruses in the built environment (VIBE) meeting report. Microbiome. 2020;8:1. https://doi.org/10.1186/s40168-019-0777-4.
Shanker S, Paulson A, Edenberg HJ, Peak A, Perera A, Alekseyev YO, et al. Evaluation of commercially available RNA amplification kits for RNA sequencing using very low input amounts of total RNA. J Biomol Tech. 2015;26:4–18. https://doi.org/10.7171/jbt.15-2601-001.
Song Y, Milon B, Ott S, Zhao X, Sadzewicz L, Shetty A, et al. A comparative analysis of library prep approaches for sequencing low input translatome samples. BMC Genomics. 2018;19:696. https://doi.org/10.1186/s12864-018-5066-2.
John G, Sahajpal NS, Mondal AK, Ananth S, Williams C, Chaubey A, et al. Next-generation sequencing (NGS) in COVID-19: A tool for SARS-CoV-2 diagnosis, monitoring new strains and phylodynamic modeling in molecular epidemiology. Curr Issues Mol Biol. 2021;43:845–67. https://doi.org/10.3390/cimb43020061.
Campos GS, Sardi SI, Falcao MB, Belitardo EMMA, Rocha DJPG, Rolo CA, et al. Ion torrent-based nasopharyngeal swab metatranscriptomics in COVID-19. J Virol Methods. 2020;282:113888. https://doi.org/10.1016/j.jviromet.2020.113888.
Meng Y, Xiao L, Chen W, Zhao F, Zhao X. An efficient metatranscriptomic approach for capturing RNA virome and its application to SARS-CoV-2. J Genet Genomics. 2021;48:860–2. https://doi.org/10.1016/j.jgg.2021.08.005.
Chan AP, Siddique A, Desplat Y, Choi Y, Ranganathan S, Choudhary KS, et al. A universal day zero infectious disease testing strategy leveraging CRISPR-based sample depletion and metagenomic sequencing. Medrxiv. 2022. https://doi.org/10.1101/2022.05.12.22274799.
Liu T, Chen Z, Chen W, Chen X, Hosseini M, Yang Z, et al. A benchmarking study of SARS-CoV-2 whole-genome sequencing protocols using COVID-19 patient samples. iScience. 2021;24:102892. https://doi.org/10.1016/j.isci.2021.102892.
Parker CW, Singh N, Tighe S, Blachowicz A, Wood JM, Seuylemezian A, et al. End-to-end protocol for the detection of SARS-CoV-2 from built environments. mSystems. 2020;5:e00771-e820. https://doi.org/10.1128/mSystems.00771-20.
Acknowledgements
We thank Mr. Reo Itabashi for his assistance with sampling and nucleic acid extraction. Sequencing was undertaken at the NODAI Genome Research Center at Tokyo University of Agriculture (http://www.nodai-genome.org/?lang=en).
Funding
This work was supported by the Japan Science and Technology Agency (JST), CREST Grant Number JPMJCR20H1, Japan.
Author information
Contributions
YS, BT, and HS contributed to the study design. YS performed data collection and analysis and drafted the manuscript. BT, MS, JK, CM, and HS edited the manuscript. All authors read and approved the final version of the manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors have no competing interests to declare.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1:
Materials and methods
Additional file 2: Table S1
Library preparation and sequencing summary metrics of the mock community samples. Table S2 Species relative abundance of the mock community samples estimated using Bracken. Table S3 Species relative abundance of the table and negative control samples estimated using Bracken. Table S4 The number of reads and coverage for each SARS-CoV-2 genomic feature (e.g., genes, 3′ UTR). Table S5 Library preparation and sequencing summary metrics of both human and bacterial rRNA-depleted samples.
Additional file 3: Figure S1
Graphical view of reads mapped to the reference genome of SARS-CoV-2. The blue tracks represent the SARS-CoV-2 reference strain, and the subsequent four tracks represent BAM files of bacterial rRNA-depleted samples spiked with synthetic RNA of SARS-CoV-2 (s100k, 10⁵; s10k, 10⁴; s1k, 10³; s0.1k, 10²; copies of SARS-CoV-2 RNA). Figure S2 Sankey diagrams of the Kraken 2 report based on human and bacterial rRNA-depleted samples (THB1 and THB2).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Shiwa, Y., Baba, T., Sierra, M.A. et al. Evaluation of rRNA depletion methods for capturing the RNA virome from environmental surfaces. BMC Res Notes 16, 142 (2023). https://doi.org/10.1186/s13104-023-06417-9
DOI: https://doi.org/10.1186/s13104-023-06417-9