- Research note
- Open Access
Genomic characterization of bacteriophage BI-EHEC infecting strains of Enterohemorrhagic Escherichia coli
BMC Research Notes volume 14, Article number: 459 (2021)
The aim of this research was to determine the genomic properties of BI-EHEC, a bacteriophage isolated in a previous study, for use in controlling Enterohemorrhagic Escherichia coli (EHEC). Genomic analysis is essential for assessing whether this bacteriophage is suitable for further application as a food preservative.
The genome of BI-EHEC was successfully annotated using multiPhATE2. Structural and lytic cycle-related proteins such as head, tail, capsid, and lysozyme (lysin) were annotated. The phylogenetic tree of the tail fiber protein and the BRIG results showed that BI-EHEC was similar to phages of the same host in the bacteriophage genome database. No virulence genes, antibiotic resistance genes or lysogeny-related proteins were found among the annotated genes, which implied that BI-EHEC follows a lytic life cycle. PHACTS analysis, performed to test this further, yielded a lytic cycle result. Further analysis using CARD found no residual ARGs in BI-EHEC under the recommended parameters. BI-EHEC was therefore identified as a lytic bacteriophage, making it a good candidate biocontrol agent.
Foodborne disease is often caused by consuming food contaminated by bacteria, and EHEC is one such foodborne bacterium. Conventional preservation methods have many disadvantages, such as the loss of nutritional and organoleptic value [1, 2]. Bacteriophages can be used as an alternative approach. Bacteriophages have two different life cycles. In the lytic life cycle, the bacteriophage lyses the bacterial host and releases progeny; lytic bacteriophages are preferred as biocontrol agents because they minimize the probability of horizontal gene transfer. In the lysogenic life cycle, by contrast, the phage DNA is only replicated in the host. Lytic bacteriophage is .
Our previous study isolated a bacteriophage from bovine intestine, referred to as BI-EHEC, which was found to be effective in controlling EHEC, with a 91.02% reduction. However, the genomic properties of BI-EHEC have not yet been analyzed. In this research, we used an in silico approach to determine whether BI-EHEC meets the criteria for a promising biocontrol agent candidate.
Bacteriophage enrichment and purification
EHEC was grown on Luria Bertani (LB) agar media (OXOID), incubated at 37 °C overnight, then stored in a refrigerator at 4 °C. Host bacteria were grown in LB broth media in a water bath shaker (Lab Companion) at 120 rpm, 37 °C overnight. BI-EHEC from stock solution was enriched by adding phage solution (1.63 ± 0.65 × 10^10 PFU/mL) and its host bacteria (10^8 CFU/mL, OD600 = 0.132) to fresh LB broth media (OXOID). The suspension was incubated in a water bath shaker at 120 rpm, 37 °C, overnight, then centrifuged (Eppendorf) at 5488×g for 15 min. The pellet was discarded, and the supernatant was filtered through a 0.22 μm microfilter (Himedia, Mumbai, India). The purified bacteriophage stock can be kept at 4 °C with the addition of Ringer Solution (OXOID) (1:9 v/v) for further steps. Additionally, the agar overlay method was performed to verify the activity and presence of BI-EHEC by observing clear plaques [5,6,7].
Isolation of bacteriophage genomic material
A 5 μL volume of DNase I (Geneaid) was added to purified bacteriophage (1.63 ± 0.65 × 10^10 PFU/mL), then incubated at 37 °C for 30 min. Then 6 μL of 0.05 M EDTA, 10 μL of 1% sodium dodecyl sulfate (SDS) and 6 μL of proteinase K (Geneaid) (10 mg/mL) were added. The mixture was incubated at 37 °C for 1 h. Then 600 μL of phenol–chloroform–isoamyl alcohol solution (25:24:1) was added and the mixture was centrifuged (Thermo) at 2655×g for 5 min. The upper phase was transferred to a new microtube, mixed with 500 μL of chloroform–isoamyl alcohol solution (24:1), and centrifuged at 2655×g for 5 min. The upper phase was again transferred to a new microtube. 3 M sodium acetate, pH 5.2 (1:10), followed by isopropyl alcohol (1:1) (MERCK), was added to the mixture, which was incubated in an ice bath for 15 min. The suspension was then centrifuged at 17,949×g for 10 min, and the supernatant was removed. About 700 μL of 70% ethanol was added to the pellet, and the mixture was centrifuged again at 17,949×g for 10 min. The supernatant was removed, and the pellet was dried. The pellet was dissolved in 50 μL of nuclease-free water (NFW) (Qiagen) and the DNA was stored at 4 °C.
Next-generation sequencing (NGS)
gDNA obtained from bacteriophage genomic isolation was sent to PT Genetika Science Indonesia for NGS using Oxford Nanopore Technologies (MinKNOW 20.06.9). Base calling was done using Guppy 4.0.11 in high-accuracy mode. Raw NGS data were filtered using Filtlong v.0.2.0 with default parameters and without an external reference. De novo assembly of the Filtlong output was done with Flye v.2.8.3 using the default parameters for Oxford Nanopore input. Medaka 1.2.0 (default parameters) was used to polish the assembled genome. The resulting fasta was treated as the complete genome assembly for BI-EHEC.
Genome annotation was carried out with multiPhATE2, using default databases (Phantome, pVOGs) and supporting databases (NCBI virus genomes, NCBI Swissprot, CAZy). A phylogenetic tree of the tail fiber protein was constructed using MEGAX (nucleotide sequence) [13, 14]. BLAST analysis was carried out to identify the phages that BI-EHEC most closely resembles. Two additional bacteriophages were chosen from the NCBI database to be compared with BI-EHEC using BRIG. Virulence (eae, lpf, stx) and lysogenic (int, xis) genes were also compared with BI-EHEC via BRIG. Further analysis was done using CARD to study the possible presence of antimicrobial resistance genes (ARGs). PHACTS was used to determine the life cycle of BI-EHEC.
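The filter, assemble and polish steps described above can be sketched as a short command builder. This is a minimal illustration, not the authors' actual invocation: the file names, the working directory and the function name `build_assembly_commands` are hypothetical, and the options shown are only the basic documented ones for each tool (Filtlong writing to standard output, Flye's `--nano-raw` and `--out-dir`, and `medaka_consensus` with `-i`, `-d`, `-o`). Whether the original run used exactly these options is an assumption.

```python
import shlex
from pathlib import Path


def build_assembly_commands(reads: str, workdir: str) -> list[list[str]]:
    """Return argv lists for filter -> assemble -> polish. Nothing is executed here."""
    out = Path(workdir)
    filtered = out / "filtered.fastq"   # hypothetical output name
    flye_dir = out / "flye"             # hypothetical output directory
    medaka_dir = out / "medaka"         # hypothetical output directory
    return [
        # Filtlong prints the filtered reads to stdout, so a shell redirect is used.
        ["sh", "-c", f"filtlong {shlex.quote(reads)} > {shlex.quote(str(filtered))}"],
        # Flye de novo assembly for raw Oxford Nanopore reads.
        ["flye", "--nano-raw", str(filtered), "--out-dir", str(flye_dir)],
        # Medaka polishing of the Flye draft (Flye writes assembly.fasta in its out dir).
        ["medaka_consensus", "-i", str(filtered),
         "-d", str(flye_dir / "assembly.fasta"), "-o", str(medaka_dir)],
    ]


if __name__ == "__main__":
    for cmd in build_assembly_commands("reads.fastq", "run1"):
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once the tools are installed; keeping the builder separate from execution makes the exact commands easy to log alongside the software versions reported above.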
The BI-EHEC genome (GenBank accession number OL505078) is composed of 151,425 bp with 39% GC content. It has 12 encoded tRNA regions and 352 open reading frames (ORFs). Genes associated with cell lysis, assembly, and packaging at the end of the lytic cycle were annotated. These include a putative T4-like lysozyme (EC 3.2.1.17), tail fiber assembly protein, gpH, and terminases. Other annotated genes encode parts of the bacteriophage structure [3, 19]. The complete annotation can be seen in Additional file 1: Table S1, the genome map in Additional file 2: Figure S1 and selected results in Table 1.
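Genome length and GC content of the kind reported above can be recomputed from an assembly FASTA with a few lines of code. The sketch below is a generic illustration rather than the tool used in this study; the function names and the toy sequence are hypothetical, and ambiguous bases (such as N) are simply excluded from the GC denominator, which is one of several possible conventions.

```python
def read_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-formatted text into {record name: sequence}."""
    records: dict[str, list[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            name = header.split()[0] if header else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def gc_percent(seq: str) -> float:
    """GC content in percent, counting only unambiguous A/C/G/T bases."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    toy = ">toy_phage example record\nATGC\nAATT\n"  # made-up sequence
    for rec_name, seq in read_fasta(toy).items():
        print(rec_name, len(seq), "bp", f"{gc_percent(seq):.1f}% GC")
```

For a real assembly, the file contents would be read with `open(path).read()` and passed to `read_fasta`.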
Phylogenetic analysis of tail fiber protein
Sixteen tail fiber protein sequences were obtained from NCBI databases to be compared with BI-EHEC [see Additional file 3: Table S2]. The BI-EHEC tail fiber protein showed high similarity to the tail fiber protein of Escherichia phage ukendt (Figs. 1, 2).
BLAST and BRIG
BLAST (BLASTn) analysis of BI-EHEC showed the highest similarity to Escherichia phage ESCO13. Escherichia phage anhysbys and ESCO13 from the NCBI database were selected as comparison genomes for the BRIG analysis.
PHACTS and CARD
Analysis using PHACTS was performed to test whether BI-EHEC has lytic life cycle properties. The average probability produced by PHACTS for BI-EHEC was 0.519 with a standard deviation of 0.05, so PHACTS declared it a lytic bacteriophage, but non-confidently. However, PHACTS has a high confidence rate (up to 99%) in determining phage lifestyle. According to McNair et al. 2012, there is a high chance that a non-confident prediction is nevertheless correct.
CARD analyzes a molecular sequence to predict its resistome based on homology and SNP models. With the perfect and strict parameters it yielded zero results, so the analysis was repeated with loose hits included. The complete result can be observed in Additional file 4: Table S3, with TriC as the highest result. The loose hits algorithm can detect matches of lower similarity (< 95%) and more distant homologs of ARGs. However, because it yielded only results with less than 95% similarity, it may have mislabelled unrelated genes as antibiotic resistance genes.
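To make the distinction between a lifestyle call and a confident call concrete, the following sketch applies a simple two-standard-deviation rule to the values reported above. The rule itself is an assumption made for illustration (a call is treated as confident when the two lifestyle probabilities differ by at least two standard deviations, and the lysogenic probability is taken as one minus the lytic probability); the exact criterion implemented in PHACTS should be checked against McNair et al. The function name `phacts_call` is hypothetical.

```python
def phacts_call(p_lytic: float, sd: float) -> tuple[str, bool]:
    """Return (predicted lifestyle, confident?) under an assumed 2-SD rule."""
    # Assumption: two-class output, so the probabilities sum to 1.
    p_lysogenic = 1.0 - p_lytic
    lifestyle = "lytic" if p_lytic >= p_lysogenic else "lysogenic"
    # Assumption: confident only if the classes are separated by at least 2 SD.
    confident = abs(p_lytic - p_lysogenic) >= 2.0 * sd
    return lifestyle, confident


if __name__ == "__main__":
    # Values reported for BI-EHEC: mean probability 0.519, SD 0.05.
    print(phacts_call(0.519, 0.05))
```

Under these assumptions the BI-EHEC values give a lytic call that is not confident, because the separation between the two probabilities (0.038) is smaller than two standard deviations (0.10).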
Annotation with multiPhATE2 identified proteins necessary for the end of the lytic cycle as well as structural proteins (Table 1). Among the successfully annotated CDSs, no lysogenic genes were found.
The phage genome-packaging machinery consists of the portal protein, the small terminase and the large terminase. The small terminase was annotated using multiPhATE2 (Table 1); this protein initiates genome packaging and regulates large terminase functions. Meanwhile, the large terminase cleaves concatenated DNA molecules to initiate the packaging mechanism.
Phage assembly is done separately for the head, the tail, and the long tail fibers before they are joined to form a mature phage [23, 24]. Both tail fiber assembly protein (Tfa) and gpH are involved in tail assembly. Tfa is a family of proteins that act as chaperones in folding phage fibers and play a role in determining host range specificity. The tape measure protein gpH is important in determining the length of the phage tail [25, 26].
Putative T4-like lysozyme is a hydrolytic enzyme that cleaves peptidoglycan bonds. It is produced during the late stage of the lytic cycle, when assembled phages are ready to be released to the environment. Lysin possesses two main domains: the N-terminal domain functions as a catalytic domain, while the C-terminal domain serves as a binding domain that targets and binds to specific peptidoglycan ligands.
The tail fiber functions as a receptor-binding protein (RBP) in many bacteriophages. The RBP plays a role in phage host recognition and in interactions with other phages of the same host. For T4-like phages, the C-terminal and N-terminal regions of the tail fiber are important in determining receptor specificity as well as host range [27, 28]. The BI-EHEC tail fiber protein was closest to that of Escherichia phage ukendt, whose host is E. coli K-12 MG1655. A similar genetic make-up might contribute to different phages having the same host range. It is beneficial to study a variety of tail fiber genes to expand knowledge of the host range covered by phage cocktails.
The annotations and the BRIG analysis showed no lysogenic or virulence genes in BI-EHEC. Lysogenic bacteriophages use integrase and excisionase, encoded by int and xis, to integrate their DNA into the host's genome [6, 20, 22]. For virulence, the analysis covered three major virulence genes of EHEC: eae, lpf and stx, which encode intimin, long polar fimbriae (LPF) and Shiga toxin, respectively. Stx is the major virulence determinant of EHEC, while intimin and LPF aid the attachment of EHEC to its host cell.
Analysis using the CARD database was done to determine whether the phage carries over ARGs from the host. A temperate phage has a higher probability of carrying host genes. In the lysogenic state, the phage integrates its genome into that of the host and, depending on conditions favorable to the host, can co-exist as a prophage embedded in the host DNA, which creates the possibility of carrying host genes. Carry-over of ARGs is nevertheless rare. It has been suggested that transfer of ARGs by phage transduction is up to 1000-fold less common than transfer by other means.
The initial CARD analysis used the perfect and strict hits only parameters. This run yielded no results, which might indicate that no ARGs are present in BI-EHEC. Another analysis was conducted with the loose hits parameter, which includes hits with less than 95% homology across the database. This can be beneficial in detecting emerging threats, but it also produces homolog hits that might be unrelated to ARG function. By including as many hits as possible, loose hits can detect unknown proteins that may be new antibiotic resistance proteins, at the cost of lower specificity in detecting actual antibiotic resistance proteins.
Analysis using loose hits showed TriC as the highest possible match for BI-EHEC. The suggested mechanism of action of TriC is antibiotic efflux. Although hits were reported by CARD with the loose hits parameter, it was still possible to rule out the presence of ARGs. Another study found that some proteins might be mistakenly labelled as ARGs when using CARD. This finding is common with phage genomes containing leftover DNA from host cells. That study also suggested using only conservative parameters for in silico analysis to achieve the best possible matches, implying that only perfect and strict hits should be included.
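The conservative filtering described above, in which only perfect and strict hits are retained, can be sketched as follows. This is a minimal illustration with made-up records: the `Hit` class, its field names, the gene names and the identity values are all hypothetical and do not reproduce the CARD output format or the results in Table S3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    gene: str          # hypothetical gene label
    category: str      # "Perfect", "Strict" or "Loose"
    identity: float    # percent identity to the reference (made-up values below)


def conservative_hits(hits: list[Hit]) -> list[Hit]:
    """Keep only hits in the Perfect or Strict categories."""
    return [h for h in hits if h.category in {"Perfect", "Strict"}]


if __name__ == "__main__":
    # Made-up example: a genome for which only Loose hits are reported.
    example = [
        Hit("hypothetical_geneA", "Loose", 41.0),
        Hit("hypothetical_geneB", "Loose", 33.5),
    ]
    kept = conservative_hits(example)
    print(f"{len(kept)} hits retained under conservative parameters")
```

With an input consisting only of loose hits, the filter returns an empty list, which corresponds to the interpretation above that no ARGs are reported under conservative parameters.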
From the data produced by multiPhATE2, BRIG, and CARD analysis, it could be concluded that BI-EHEC leans towards a strictly lytic life cycle and that no ARGs were found in the bacteriophage genome.
BI-EHEC was successfully annotated, including structural and lytic cycle-related genes. The phylogenetic tree of the tail fiber protein and the BRIG results showed that BI-EHEC was similar to phages of the same host in NCBI. There were no signs of virulence or lysogenic proteins among the annotated genes, and PHACTS analysis supported this further. CARD results indicate that no ARGs are present in BI-EHEC. It can be concluded that BI-EHEC is a promising candidate for use as a food preservative.
The lack of annotated genes in the databases (resulting in many hypothetical protein hits) was the main limitation of this research.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
EHEC: Enterohemorrhagic Escherichia coli
ARGs: Antimicrobial resistance genes
NGS: Next-generation sequencing
BLAST: Basic local alignment search tool
MEGA: Molecular evolutionary genetics analysis
BRIG: BLAST ring image generator
PHACTS: Phage classification tool set
CARD: Comprehensive antibiotic resistance database
NCBI: National Center for Biotechnology Information
Berger CN, Sodha SV, Shaw RK, Griffin PM, Pink D, Hand P, Frankel G. Fresh fruit and vegetables as vehicles for the transmission of human pathogens. Environ Microbiol. 2010;12(9):2385–97. https://doi.org/10.1111/j.1462-2920.2010.02297.x.
Moye Z, Woolston J, Sulakvelidze A. Bacteriophage applications for food production and processing. Viruses. 2018;10(205):1–22. https://doi.org/10.3390/v10040205.
Pastagia M, Schuch R, Fischetti VA, Huang DB. Lysins: the arrival of pathogen-directed anti-infectives. J Med Microbiol. 2013;62:1506–16. https://doi.org/10.1099/jmm.0.061028-0.
Lukman C, Yonathan C, Magdalena S, Waturangi DE. Isolation and characterization of pathogenic Escherichia coli bacteriophages from chicken and beef offal. BMC Res Notes. 2020;13(8):1–7. https://doi.org/10.1186/s13104-019-4859-y.
Crothers-Stomps C, Høj L, Bourne DG, Hall MR, Owens L. Isolation of lytic bacteriophages against Vibrio harveyi. J Appl Microbiol. 2010;108(5):1744–50. https://doi.org/10.1111/j.1365-2672.2009.04578.x.
Rasool MH, Yousaf R, Siddique AB, Saqalein M, Khurshid M. Isolation, characterization, and antibacterial activity of bacteriophages against methicillin-resistant Staphylococcus aureus in Pakistan. Jundishapur J Microbiol. 2016;9(10):1–8. https://doi.org/10.5812/jjm.36135.
Thung TY, Norshafawatie SBMF, Premarathne JMKJK, Chang WS, Loo YY, Kuan CH, New CY, Ubong A, Ramzi OSB, Mahyudin NA, Dayang FB, Jasimah WMR, Son R. Isolation of food-borne pathogen bacteriophages from retail food and environmental sewage. Int FRJ. 2018;24(1):450–4.
O’Flynn G, Ross RP, Fitzgerald GF, Coffey A. Evaluation of a cocktail of three bacteriophages for biocontrol of Escherichia coli O157:H7. Appl Environ Microbiol. 2004;70(6):3417–24. https://doi.org/10.1128/AEM.70.6.3417-3424.2004.
Wick RR, Menzel P. 2018. Filtlong. Accessed 2 August 2021: https://github.com/rrwick/Filtlong.
Kolmogorov M, Yuan J, Lin Y, Pevzner P. Assembly of long error-prone reads using repeat graphs. Nat Biotechnol. 2019. https://doi.org/10.1038/s41587-019-0072-8.
Nanopore Tech. 2021. Medaka. Accessed 2 August 2021: https://github.com/nanoporetech/medaka.
Zhou CLE, Kimbrel J, Edwards R, McNair K, Souza BA, Malfatti S. MultiPhATE2: code for functional annotation and comparison of phage genomes. G3. 2021;11(5):1–5. https://doi.org/10.1093/g3journal/jkab074.
Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018;35(6):1547–9. https://doi.org/10.1093/molbev/msy096.
Kim GH, Kim JW, Kim J, Chae PJ, Lee JS, Yoon SS. Genetic analysis and characterization of a bacteriophage ØCJ19 active against enterotoxigenic Escherichia coli. Food Sci Anim Resour. 2020;40(5):746–57. https://doi.org/10.5851/kosfa.2020.e49.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10. https://doi.org/10.1016/S0022-2836(05)80360-2.
Alikhan NF, Petty NK, Zakour NLB, Beatson S. BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons. BMC Genom. 2011;12(402):1–10. https://doi.org/10.1186/1471-2164-12-402.
Alcock BP, Raphenya AR, Lau TTY, Tsang KK, Bouchard M, Edalatmand A, Huynh W, Nhuyen ALV, Cheng AA, Liu S, Min SY, et al. CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database. Nucleic Acids Res. 2020;48(1):517–25. https://doi.org/10.1093/nar/gkz935.
McNair K, Bailey BA, Edwards RA. PHACTS, a computational approach to classifying the lifestyle of phages. Bioinformatics. 2012;28(5):614–8. https://doi.org/10.1093/bioinformatics/bts014.
Roussel C, Cordonnler C, Galla W, Goff OL, Thévenot J, Chalancon S, Thevenor-Sergentet D, Leriche F, Wiele TVd, et al. Increased EHEC survival and virulence gene expression indicate an enhanced pathogenicity upon simulated pediatric gastrointestinal conditions. Pediatr Res. 2016;80(5):734–43. https://doi.org/10.1038/pr.2016.144.
Doss J, Culbertson K, Hahn D, Camacho J, Barekzi N. A review of phage therapy against bacterial pathogens of aquatic and terrestrial organisms. Viruses. 2017;9(50):1–10. https://doi.org/10.3390/v9030050.
Fauquet CM, Mayo MA, Maniloff J, Desselberger U, Ball LA, editors. The double stranded DNA viruses. Massachusetts: Academic press; 2005.
Granoff A, Webster RG, editor. Encyclopedia of Virology (Second Edition). Amsterdam (NL): Elsevier.
Yap ML, Rossmann MG. Structure and function of bacteriophage T4. Future Microbiol. 2014;2014(9):11319–27.
Häuser R, Blasche S, Dokland T, Haggård-Ljungquist E, Av B, Salas M, Casjens S, Molineux I, Uetz P. Bacteriophage protein–protein interactions. Adv Virus Res. 2012;83:219–98. https://doi.org/10.1016/B978-0-12-394438-2.00006-2.
North OI, Davidson AR. Phage proteins required for tail fiber assembly also bind specifically to the surface of host bacterial strains. J Bacteriol. 2020;2013(3):1–19. https://doi.org/10.1128/JB.00406-20.
Simpson DJ, Sacher JC, Szymanski CM. Development of an assay for the identification of receptor binding proteins from bacteriophages. Viruses. 2016;8(1):17. https://doi.org/10.3390/v8010017.
Chen M, Zhang L, Abdelgader SA, Yu L, Xu J, Yao H, Lu C, Zhang W. Alterations in gp37 expand the host range of a T4-like phage. Appl Environ Microbiol. 2017;83(23):01576–617. https://doi.org/10.1128/aem.01576-17.
Olsen NS, Forero-Junco L, Kot W, Hansen LH. Exploring the remarkable diversity of culturable Escherichia coli phages in the Danish wastewater environment. Viruses. 2020;12(9):986. https://doi.org/10.3390/v12090986.
Enault F, Briet A, Bouteille L, Roux S, Sullivan MB, Petit MA. Phages rarely encode antibiotic resistance genes: a cautionary tale for virome analyses. ISME J. 2017;11:237–47. https://doi.org/10.1038/ismej.2016.90.
Belinda L, Jiayuan C, Prasanth M, Yunsong Y, Xiaoting H, Sebastian L. A biological inventory of prophages in A. baumannii genomes reveal distinct distributions in classes, length, and genomic positions. Front Microbiol. 2020;11(2020):3055.
Zheng W, Xu W, Xu Y, Liao W, Zhao Y, Zheng X, Xu C, Zhou T, Cao J. The prevalence and mechanism of triclosan resistance in Escherichia coli isolated from urine samples in Wenzhou, China. Antimicrob Resist Infect Control. 2020;9(161):1–10. https://doi.org/10.1021/bi00152a001.
The authors acknowledge research funding support from the Indonesian Ministry of Education and Culture through the National Research Grant 2020 (Fundamental Research).
This study was funded by DIKTI 2020. The funder had no role in the design of the study, the collection and interpretation of data, or the writing of the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Table S1 Full annotations of BI-EHEC.
Figure S1 Genome map of BI-EHEC. Annotations were selected based on their role in the lytic cycle and/or phage structure.
Table S2 List of tail fiber proteins from the NCBI database and their accession numbers.
Table S3 BI-EHEC CARD results (highest to lowest best identities).
Cite this article
Dewanggana, M.N., Waturangi, D.E. & Yogiara. Genomic characterization of bacteriophage BI-EHEC infecting strains of Enterohemorrhagic Escherichia coli. BMC Res Notes 14, 459 (2021). https://doi.org/10.1186/s13104-021-05881-5
- Food preservatives
- Biocontrol agent
- Genome sequence