Skip to main content

Evaluation of rRNA depletion methods for capturing the RNA virome from environmental surfaces

Abstract

Objective

Metatranscriptomic analysis of RNA viromes on built-environment surfaces is hampered by low RNA yields and high abundance of rRNA. Therefore, we evaluated the quality of libraries, efficiency of rRNA depletion, and viral detection sensitivity using a mock community and a melamine-coated table surface RNA with levels below those required (< 5 ng) with a library preparation kit (NEBNext Ultra II Directional RNA Library Prep Kit).

Results

Good-quality RNA libraries were obtained from 0.1 ng of mock community and table surface RNA by changing the adapter concentration and number of PCR cycles. Differences in the target species of the rRNA depletion method affected the community composition and sensitivity of virus detection. The percentage of viral occupancy in two replicates was 0.259 and 0.290% in both human and bacterial rRNA-depleted samples, a 3.4 and 3.8-fold increase compared with that for only bacterial rRNA-depleted samples. Comparison of SARS-CoV-2 spiked-in human rRNA and bacterial rRNA-depleted samples suggested that more SARS-CoV-2 reads were detected in bacterial rRNA-depleted samples. We demonstrated that metatranscriptome analysis of RNA viromes is possible from RNA isolated from an indoor surface (representing a built-environment surface) using a standard library preparation kit.

Peer Review reports

Introduction

A growing interest in the presence of microbes and viruses on built-environment surfaces [1] has encouraged the international consortium MetaSUB [2] (established in 2015) to investigate the urban microbiome on a global scale. Approximately 5,000 samples obtained from built-environment surfaces (such as subway stations) in 60 cities worldwide were subjected to metagenomic analyses [3]. Results revealed the presence of ~ 11,000 DNA viruses and indicated that the microbiome of built-environment surfaces is distinctly different from that of other environments, such as the human body and soil. The METACoV project, derived from this project, aimed to characterize the changes in the urban microbiome and RNA virome, which includes the SARS-CoV-2 virus, during the COVID-19 pandemic via shotgun metatranscriptomics (total RNA-seq) [4].

However, RNA virome analysis of built-environment surfaces is associated with several challenges. First, the amount of RNA obtained is markedly lower (in the range of picograms or less) than the input recommendations of standard library preparation kits (in the range of nanograms) owing to the extremely low biomass detected [5,6,7]. Standard protocols for library construction usually recommend starting with a nanogram order of total RNA. However, several researchers have successfully prepared high-quality libraries from lower amounts of input RNA (250–500 pg) than those recommended by the manufacturers [8, 9]. Second, since most shotgun DNA sequences on built-environment surfaces are of bacterial origin, followed by eukaryotes such as fungi and humans [5, 6], abundant rRNA might reduce the sensitivity of RNA virome analysis. Several studies that detected SARS-CoV-2 in human clinical specimens via metatranscriptomics reported that human or bacterial rRNA depletion can improve viral genome detection [10,11,12,13,14]. Owing to the issues and sequencing costs of RNA analysis, no reports of metatranscriptomic RNA virome analysis from built-environment surfaces are currently available, to our knowledge.

Therefore, in this pilot study, we evaluated the impact of rRNA depletion methods on the quality of sequencing libraries, community composition, and sensitivity of virus detection using mock community RNA (0.1–10 ng) and a melamine-coated table surface RNA (< 1 ng), which had RNA amounts below the recommended input RNA amount (< 5 ng) for the library preparation kit used here. We also investigated the detection limit of SARS-CoV-2 using serial dilutions of synthetic viral RNA spiked into RNA samples from the table surface.

Materials and methods

Please refer to Additional file 1 for sample collection, RNA extraction, rRNA depletion, library preparation and sequencing, and bioinformatics methods. The sampling, RNA extraction, and pooling strategy is illustrated in Fig. 1.

Fig. 1
figure 1

The sampling, RNA extraction, and pooling strategy. Underlines correspond to sample names in the main text. Figure created with BioRender.com

Results and discussion

Effects of mock community RNA input amount on library preparation

We used a combination of NEBNext rRNA Depletion Kit for rRNA depletion and NEBNext Ultra II Directional RNA Library Prep Kit for metatranscriptomic library preparation. The minimum recommended combined amount of total RNA for these kits is 5 ng. To test the feasibility of library preparation from RNA amounts below the protocol requirements, total RNA extracted from ZymoBIOMICS Microbial Community Standard cells was serially diluted to 10, 1.0, and 0.1 ng, and rRNA depletion and library preparation were performed. Since this mock community primarily consisted of bacteria, rRNA depletion kits for human RNA were used to preserve the abundant bacterial rRNA.

All three samples (10, 1, and 0.1 ng) were successfully prepared for library construction and sequenced. Adapter concentration and number of PCR cycles were optimally adjusted, resulting in high-quality reads (Additional file 2: Table S1). Read duplication rates increased with decreasing amounts of input RNA. Higher duplicate rates indicate lower RNA complexity in the samples, which can reflect the presence of abundant bacterial rRNA that was not depleted (Additional file 2: Table S1) and the insufficient amount of input RNA, especially in the 0.1 ng sample. We subsequently assessed the bias of reduced RNA input on the species relative abundance estimate, and no clear differences in taxonomic compositions were observed among the three samples (Additional file 2: Table S2). Since information on the RNA composition ratios of each species in the mock community was not provided by the manufacturer, comparisons with theoretical content were not possible. However, when results for 10 ng RNA samples were compared with those for the other samples, no marked differences were observed, except for a slight difference in the percentage of Bacillus subtilis. This indicates that the combined use of the rRNA depletion kit and library preparation kit can produce high-quality libraries, even when the RNA input is reduced to 0.1 ng (100 pg). Our results are consistent with the 250–500 pg of RNA input reported previously [8, 9].

Library preparation from table surface samples, rRNA depletion efficiency, and community composition

To evaluate the feasibility of library preparation from built-environment surface RNA samples with less than the recommended RNA amount for the library preparation kit, we used RNA samples from a melamine-coated table used daily by students. Simultaneously, we evaluated the impact of human or bacterial rRNA depletion on community composition and virus detection sensitivity. We spiked serial dilutions of synthetic SARS-CoV-2 RNA (101–105 copies) into RNA samples from the table, followed by rRNA depletion of either human or bacterial rRNA, and then performed library preparation. Abundant bacteria were assumed to be present on the table surface [5, 6]; therefore, bacterial rRNA depletion was performed for five samples representing the full range of spike-in levels evaluated. An additional human rRNA-depleted sample spiked with 105 copies of SARS-CoV-2 was also prepared.

Although the RNA concentrations of the table surface samples were below the limit of detection of Qubit RNA HS Assay Kit (0.25 ng/µL), all samples (except the negative control) allowed generation of sufficient libraries for sequencing (Table 1). Thus, the library concentration can be used as a proxy for the RNA concentration initially extracted from each sample, and since the library concentration obtained from the table sample is approximately three times higher than that obtained from 0.1 ng of mock RNA (Additional file 2: Table S1 and Table 1), the amount of RNA in the table sample was estimated to be ~ 0.3 ng.

Table 1 Library preparation and sequencing summary metrics

All samples, including negative controls, were successfully sequenced (Table 1). The average percentage of high-quality reads that passed quality filtering was 97.1%, with read duplication rates of 55.4–61.9%. The rRNA read percentages showed clear differences between the two depletion methods. The percentage of eukaryotic rRNA reads in both human and bacterial rRNA-depleted samples averaged 4.4% and 88.5%, respectively. Conversely, the percentage of bacterial rRNA reads in human and bacterial rRNA-depleted samples averaged 86.3% and 0.9%, respectively.

We evaluated the effect of different rRNA depletion methods on community composition (Table 2). The estimated relative abundance of eukaryotes in human and bacterial rRNA-depleted samples averaged 19.9% and 89.1%, respectively, while that of bacteria in human and bacterial rRNA-depleted samples averaged 79.8% and 10.6%, respectively, a percentage of organisms that corresponds to that for the rRNA depletion method. When compared with that of the non-spiked samples (TH vs. TB samples), the estimated viral occupancy of bacterial rRNA-depleted samples was slightly higher than that of human rRNA-depleted samples (Table 2, TH: 0.061% vs. TB: 0.076%). Even after rRNA depletion using either method, the percentage of non-rRNA reads was ~ 10% (Table 1).

Table 2 Sequence read classification and mapping to the SARS-CoV-2 genome

We then focused on the community composition of the table surface (Fig. 2 and Additional file 2: Table S3). In human rRNA-depleted samples, the commensal skin bacterium Cutibacterium acnes was the most abundant. This is likely because table surfaces frequently come in contact with the human skin. In contrast, in bacterial rRNA-depleted samples, mites and nematodes (Tyrophagus putrescentiae, Poikilolaimus oxycercus) and seaweed (Ulva expansa/prolifera) were the most abundant. Since seaweed is commonly used in Japanese foods, it may have originated from meal contaminants present on table surfaces.

Fig. 2
figure 2

Sankey diagrams of the community composition of the table surface. The number of reads is proportional to the flow width. The corresponding number of each node represents the k-mer hit number. Domain (D), kingdom (K), phylum (P), class (C), order (O), family (F), genus (G), or species (S) was used as the rank code. TH, table surface sample depleted of human rRNA; TB, table surface sample depleted of bacterial rRNA

This demonstrates that metatranscriptome analysis may be performed on RNA from table surfaces using a standard library preparation kit. Additionally, differences in target species for the rRNA depletion method affected community composition and viral detection sensitivity.

Effects of rRNA depletion on sensitivity of SARS-CoV-2 RNA spiked into RNA samples

To assess the detection limit of SARS-CoV-2 RNA, we evaluated the number of SARS-CoV-2 reads obtained for the depleted samples of spiked human and bacterial rRNA using a k-mer (Kraken2–Bracken) and an alignment-based (BWA-MEM) approach. Kraken2–Bracken results indicated that no reads were detected in the negative control and non-spiked samples; however, SARS-CoV-2 reads were detected at > 103 copies in bacterial rRNA-depleted samples, with the number of reads increasing almost 10 fold with an increase in copy number (Table 2). When compared with samples with 105 copies of SARS-CoV-2 spiked-in (TH-s100k vs. TB-s100k), the bacterial rRNA-depleted sample (TB-s100k) showed a ~ 5 fold increase in the number of SARS-CoV-2 reads compared with the human rRNA-depleted sample (TH-s100k, Bracken reads: 65,368 vs. 12,693).

While Kraken2-Bracken did not detect any reads classified as SARS-CoV-2 in negative controls and non-spiked samples, BWA-MEM results revealed 81–1,437 reads mapped to the SARS-CoV-2 reference genome in those samples, signifying false positive results. Majority of these reads mapped to a polyA-containing region in the 3′ UTR of low complexity (Additional file 2: Table S4), suggesting that they were false positives. Similarly, 72 (TB-s0.01 k) and 86 (TB-s0.1 k) reads detected with samples spiked-in with less than 103 copies were attributed to false positive results. Compared with those in the samples with 105 copies of SARS-CoV-2 (TH-s100k vs. TB-s100k), SARS-CoV-2 reads were slightly higher in the bacterial rRNA-depleted sample, although the difference was smaller than that obtained using Kraken2-Bracken. When comparing bacterial rRNA-depleted samples, SARS-CoV-2 genome coverage was 43.3% and 98.0% for the 103 (TB-s1k) and 104 (TB-s10k) copies of SARS-CoV-2 spiked-in samples, respectively (Table 2 and Additional file 3: Figure S1). The average depth of genome coverage was 2.7 × and 28.2 × for the 103 (TB-s1k) and 104 (TB-s10k) copies of SARS-CoV-2 spiked-in samples, respectively.

In summary, at least 104 copies are required to obtain the average genome coverage depth (~ 30 ×) and genome coverage (> 98%) needed to detect mutations in the SARS-CoV-2 genome in this experimental system, and the detection limit for SARS-CoV-2 RNA is 103 copies. This value corresponds to that reported in previous studies on SARS-CoV-2 surface swabbing using qRT-PCR [15]. Although our study used synthetic RNA and theirs used inactivated viral particles, they reported that a minimum of 1,000 viable viral particles per 25 cm2 surface is required to ensure successful virus recovery and detection.

Efficiency of simultaneous depletion of both human and bacterial rRNA

Depletion of both human- and bacteria-derived rRNA may further increase the number of virus-derived reads; therefore, we employed simultaneous depletion of both human and bacterial rRNA (see Materials and Methods). The percentage of non-rRNA reads was 61.6% (THB1) and 62.6% (THB2) in the simultaneously depleted samples (Additional file 2: Table S5), markedly higher than that in samples depleted of either human (9.9%) or bacterial (9.6%) rRNA (Table 1). The percentage of viral occupancy was 0.259% (THB1) and 0.290% (THB2) in the simultaneously depleted samples (Additional file 2: Table S5), a 3.4-fold (THB1) and 3.8-fold (THB2) increase compared to that in the bacterial rRNA-depleted samples (0.076%; Table 2).

This demonstrates that simultaneous depletion of both human- and bacteria-derived rRNAs further increased the number of virus-derived reads. However, human RNA was not the only major source of eukaryotic RNA on the table surface (Additional file 3: Figure S2), suggesting that depletion of rRNAs of non-human eukaryotic origin is important for further improving virus detection sensitivity.

Conclusions

Here, we demonstrated that metatranscriptome analysis can be performed on RNA from table surfaces, representative of built-environment surfaces, using a standard library preparation kit by changing the adapter concentration and number of PCR cycles. The rRNA depletion method performed also influenced community composition and virus detection sensitivity.

Limitations

The built-environment surface targeted here was limited to one table, and the effects of rRNA depletion on detecting a small number of virus-derived reads might be considerably different with other surfaces. Owing to the limited amount of table surface-derived RNA that could be prepared, verifying reproducibility under each condition was difficult. Additionally, because the viral synthetic RNA used was spiked into the sample RNA after extraction, the detection sensitivity of the viral synthetic RNA may have been biased toward high sensitivity, compared to more realistic methods that absorb synthetic RNA into the swab during sampling.

Availability of data and materials

The sequenced raw reads used for this study were deposited in the DNA Data Bank of Japan Sequence Read Archive (DRA), the National Center for Biotechnology Information Sequence Read Archive (SRA), and the European Bioinformatics Institute Sequence Read Archive (ERA) under the accession numbers DRA014951 and DRA015134.

Abbreviations

DRS:

DNA/RNA shield

NTC:

No template control

References

  1. Gilbert JA, Stephens B. Microbiology of the built environment. Nat Rev Microbiol. 2018;16:661–70. https://doi.org/10.1038/s41579-018-0065-5.

    Article  CAS  PubMed  Google Scholar 

  2. The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium. http://metasub.org. Accessed 5 Aug 2022.

  3. Danko D, Bezdan D, Afshin EE, Ahsanuddin S, Bhattacharya C, Butler DJ, et al. A global metagenomic map of urban microbiomes and antimicrobial resistance. Cell. 2021;184:3376-93.e17. https://doi.org/10.1016/j.cell.2021.05.002.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  4. MetaCoV: RNA/COVID-19 ENVIRONMENTAL SAMPLING. http://metasub.org/projects/. Accessed 5 Aug 2022.

  5. Hsu T, Joice R, Vallarino J, Abu-Ali G, Hartmann EM, Shafquat A, et al. Urban transit system microbial communities differ by surface type and interaction with humans and the environment. mSystems. 2016;1:e00018-e116. https://doi.org/10.1128/mSystems.00018-16.

    Article  PubMed  PubMed Central  Google Scholar 

  6. Afshinnekoo E, Meydan C, Chowdhury S, Jaroudi D, Boyer C, Bernstein N, et al. Geospatial resolution of human and bacterial diversity with city-scale metagenomics. Cell Syst. 2015;1:72–87. https://doi.org/10.1016/j.cels.2015.01.001.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  7. Prussin AJ, Belser JA, Bischoff W, Kelley ST, Lin K, Lindsley WG, et al. Viruses in the built environment (VIBE) meeting report. Microbiome. 2020;8:1. https://doi.org/10.1186/s40168-019-0777-4.

    Article  PubMed  PubMed Central  Google Scholar 

  8. Shanker S, Paulson A, Edenberg HJ, Peak A, Perera A, Alekseyev YO, et al. Evaluation of commercially available RNA amplification kits for RNA sequencing using very low input amounts of total RNA. J Biomol Tech. 2015;26:4–18. https://doi.org/10.7171/jbt.15-2601-001.

    Article  PubMed  PubMed Central  Google Scholar 

  9. Song Y, Milon B, Ott S, Zhao X, Sadzewicz L, Shetty A, et al. A comparative analysis of library prep approaches for sequencing low input translatome samples. BMC Genomics. 2018;19:696. https://doi.org/10.1186/s12864-018-5066-2.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  10. John G, Sahajpal NS, Mondal AK, Ananth S, Williams C, Chaubey A, et al. Next-generation sequencing (NGS) in COVID-19: A tool for SARS-CoV-2 diagnosis, monitoring new strains and phylodynamic modeling in molecular epidemiology. Curr Issues Mol Biol. 2021;43:845–67. https://doi.org/10.3390/cimb43020061.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  11. Campos GS, Sardi SI, Falcao MB, Belitardo EMMA, Rocha DJPG, Rolo CA, et al. Ion torrent-based nasopharyngeal swab metatranscriptomics in COVID-19. J Virol Methods. 2020;282:113888. https://doi.org/10.1016/j.jviromet.2020.113888.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Meng Y, Xiao L, Chen W, Zhao F, Zhao X. An efficient metatranscriptomic approach for capturing RNA virome and its application to SARS-CoV-2. J Genet Genomics. 2021;48:860–2. https://doi.org/10.1016/j.jgg.2021.08.005.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Chan AP, Siddique A, Desplat Y, Choi Y, Ranganathan S, Choudhary KS, et al. A universal day zero infectious disease testing strategy leveraging CRISPR-based sample depletion and metagenomic sequencing. Medrxiv. 2022. https://doi.org/10.1101/2022.05.12.22274799.

    Article  PubMed  PubMed Central  Google Scholar 

  14. Liu T, Chen Z, Chen W, Chen X, Hosseini M, Yang Z, et al. A benchmarking study of SARS-CoV-2 whole-genome sequencing protocols using COVID-19 patient samples. iScience. 2021;24:102892. https://doi.org/10.1016/j.isci.2021.102892.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  15. Parker CW, Singh N, Tighe S, Blachowicz A, Wood JM, Seuylemezian A, et al. End-to-end protocol for the detection of SARS-CoV-2 from built environments. mSystems. 2020;5:e00771-e820. https://doi.org/10.1128/mSystems.00771-20.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

Download references

Acknowledgements

We thank Mr. Reo Itabashi for his assistance with sampling and nucleic acid extraction. Sequencing was undertaken at the NODAI Genome Research Center at Tokyo University of Agriculture (http://www.nodai-genome.org/?lang=en).

Funding

This work was supported by the Japan Science and Technology Agency (JST), CREST Grant Number JPMJCR20H1, Japan.

Author information

Authors and Affiliations

Authors

Contributions

YS, BT, and HS contributed to the study design. YS performed data collection and analysis and drafted the manuscript. BT, MS, JK, CM, and HS edited the manuscript. All authors read and approved the final version of the manuscript.

Corresponding authors

Correspondence to Yuh Shiwa or Haruo Suzuki.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors have no competing interests to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Materials and methods

Additional file 2: Table S1

Library preparation and sequencing summary metrics of the mock community samples. Table S2 Species relative abundance of the mock community samples estimated using Bracken. Table S3 Species relative abundance of the table and negative control samples estimated using Bracken. Table S4 The number of reads and coverage for each SARS-CoV-2 genomic feature (e.g., genes, 3′ UTR). Table S5 Library preparation and sequencing summary metrics of both human and bacterial rRNA-depleted samples.

Additional file 3: Figure S1

Graphical view of reads mapped to the reference genome of SARS-CoV-2. The blue tracks represent the SARS-CoV-2 reference strain, and the subsequent four tracks represent Bam files of bacterial rRNA-depleted samples spiked with synthetic RNA of SARS-CoV-2 (s100k, 105; s10k, 104; s1k, 103; s0.1k, 102; copies of SARS-CoV-2 RNA). Figure S2 Sankey diagrams of the Kraken 2 report based on human and bacterial rRNA-depleted samples (THB1 and THB2).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Shiwa, Y., Baba, T., Sierra, M.A. et al. Evaluation of rRNA depletion methods for capturing the RNA virome from environmental surfaces. BMC Res Notes 16, 142 (2023). https://doi.org/10.1186/s13104-023-06417-9

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s13104-023-06417-9

Keywords