- Research note
- Open Access
Computational characterization and analysis of molecular sequence data of Elizabethkingia meningoseptica
BMC Research Notes volume 15, Article number: 133 (2022)
Elizabethkingia meningoseptica is a multidrug-resistant bacterium that primarily causes meningitis in neonates and immunocompromised patients. Although it is a known agent of nosocomial infection, little information is available in the literature about its genomic makeup and associated features. Here we study these features with bioinformatics tools, covering nucleotide composition, embedded periodicities, open reading frames, origin of replication, phylogeny, orthologous gene clusters and pathways.
The complete DNA and protein sequences of E. meningoseptica were analyzed as part of the study. The E. meningoseptica G4076 genome showed 7593 ORFs and a mean GC content of 36.5%. Fourier-based analysis showed the typical three-base periodicity at the genome level. A putative origin of replication has been identified. Phylogenetically, E. meningoseptica is relatively closer to E. anophelis than to other Elizabethkingia species. A total of 2606 COGs were shared by all five Elizabethkingia species. Out of 3391 annotated proteins, we identified 18 unique ones involved in metabolic pathways of E. meningoseptica, which can serve as a starting point for drug design and development. Our study is novel in characterizing and analyzing the whole genome data of E. meningoseptica.
In 1959, Elizabeth O. King discovered the organism now known as Elizabethkingia (renamed in 2005), earlier known as Chryseobacterium. It is a non-glucose-fermenting, non-motile, catalase- and oxidase-positive, Gram-negative bacterium belonging to the family Flavobacteriaceae, ubiquitous in soil, fresh water and salt water. The genus comprises six species: E. meningoseptica, associated with meningitis and sepsis in premature neonates [4, 5]; E. anophelis, isolated from the midgut of Anopheles gambiae mosquitoes, which causes respiratory tract illness in humans; E. miricola, isolated from condensation water collected in 1997 on the Russian space station Mir; and E. bruuniana, E. ursingii and E. occulta (three CDC genomospecies).
Elizabethkingia meningoseptica is a causative agent of meningitis in neonates and of sepsis in immunocompromised patients. The occurrence of nosocomial infection has risen, mainly in patients with prolonged hospitalization who were treated with invasive procedures and subsequently with broad-spectrum antimicrobials, as well as in those with concomitant infections. The mortality rate in patients infected with E. meningoseptica is significantly higher owing to its unusual resistance pattern and mechanism. Further studies are needed to establish the most effective therapeutic approach. One can follow the time-consuming and labor-intensive experimental approach, but advances in bioinformatics have provided numerous software tools for analyzing and extracting information from molecular sequence, structure, expression and pathway data [12, 13].
The current study focused on analyzing the whole genome data of Elizabethkingia, first to unravel embedded features not hitherto reported and second to explore possible leads toward novel therapeutic candidates. Accordingly, we studied genomic features, origin of replication sites, phylogenetic relationships and comparative genomics among Elizabethkingia species, and further explored a subtractive genomics approach together with pathway analysis.
Genome analysis of E. meningoseptica G4076 and its comparisons with Elizabethkingia family
The whole genome (Accession Number NZ_CP016376) and protein sequences of Elizabethkingia meningoseptica G4076 were downloaded from NCBI (www.ncbi.nlm.nih.gov). The nucleotide composition of the genome was obtained using ORIS software. To find all open reading frames in the genome, ORFfinder, a graphical tool, was used (https://www.ncbi.nlm.nih.gov/orffinder/). The CGView Server was used to draw the circular plot of the genome. A Discrete Fourier Transform based computational approach, implemented in customized Python code, was used to examine the typical three-base periodicity embedded in the E. meningoseptica genomic sequence. The Rapid Annotation using Subsystem Technology (RAST) server was used for genome annotation [18, 19]. The Ori-Finder and ORIS v1.0 software tools were used to identify putative origin of replication (oriC) sites in the genome. MEGA X software was used to carry out phylogenetic analysis of species within the genus (E. miricola, E. meningoseptica, E. anophelis, E. bruuniana, E. ursingii and E. occulta) as well as Flavobacterium columnare ATCC49512 and Riemerella anatipestifer ATCC11845 (other genera in the same family). Orthologous genes among Elizabethkingia species were identified using OrthoVenn2 with default parameters [22, 23].
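The three-base periodicity check mentioned above can be sketched as follows. This is a minimal illustration and not the customized code used in the study: the function names `power_spectrum` and `period3_signal_to_noise` are hypothetical, and the sketch assumes the common formulation in which each base is converted to a binary indicator sequence, the discrete Fourier transform of each indicator is taken, and the squared magnitudes are summed. Under that assumption, a protein-coding stretch is expected to show a peak at frequency 1/3.

```python
import numpy as np


def power_spectrum(seq: str) -> np.ndarray:
    """Sum of DFT power over the four base indicator sequences.

    Index k of the result corresponds to frequency k / len(seq).
    """
    seq = seq.upper()
    total = np.zeros(len(seq) // 2 + 1)
    for base in "ACGT":
        # Binary indicator: 1 where the base occurs, 0 elsewhere.
        indicator = np.array([1.0 if ch == base else 0.0 for ch in seq])
        total += np.abs(np.fft.rfft(indicator)) ** 2
    return total


def period3_signal_to_noise(seq: str) -> float:
    """Power at frequency 1/3 divided by the mean power (zero frequency excluded)."""
    n = len(seq)
    if n < 6 or n % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3 and at least 6")
    spectrum = power_spectrum(seq)
    return float(spectrum[n // 3] / spectrum[1:].mean())
```

For a whole genome, one would typically apply such a function to successive windows rather than to the full sequence at once; the window length is a free choice that this sketch does not fix.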
Subtractive genomics based computational analysis
All protein sequences of Elizabethkingia meningoseptica G4076 and Homo sapiens (host) were downloaded from the NCBI database [24, 25]. Out of the total 3406 proteins in E. meningoseptica, hypothetical proteins and proteins shorter than 100 amino acids were discarded. The remaining 2503 proteins were subjected to BLASTP against the proteome of Homo sapiens. Based on previous studies, an expectation value (e-value) cut-off of 10⁻⁴ and a minimum bit score of 100 were used as thresholds to shortlist non-homologous proteins. Further, these non-homologous proteins were queried against the Database of Essential Genes (DEG) server to obtain a list of essential genes for E. meningoseptica, using an e-value cut-off of 10⁻¹⁰ and a bit score of 100 as thresholds. The shortlisted genes, which were non-homologous to the host and essential for the bacterium, were studied further with respect to metabolic pathways.
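The filtering steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the pipeline used in the study. It assumes BLASTP results in the standard 12-column tabular format (`-outfmt 6`), where the e-value is column 11 and the bit score is column 12. It also assumes that a protein counts as host-homologous when at least one hit passes both thresholds. The function names are hypothetical.

```python
def keep_candidate(description: str, sequence: str, min_length: int = 100) -> bool:
    """Discard hypothetical proteins and proteins shorter than min_length residues."""
    if "hypothetical" in description.lower():
        return False
    return len(sequence) >= min_length


def host_homologous_ids(blast_lines, max_evalue=1e-4, min_bitscore=100.0):
    """Return query IDs with at least one hit passing both thresholds (assumed rule)."""
    homologous = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])    # column 11 of outfmt 6
        bitscore = float(fields[11])  # column 12 of outfmt 6
        if evalue <= max_evalue and bitscore >= min_bitscore:
            homologous.add(fields[0])
    return homologous


def non_homologous_ids(candidate_ids, blast_lines, max_evalue=1e-4, min_bitscore=100.0):
    """Candidates without a qualifying host hit, in their original order."""
    hits = host_homologous_ids(blast_lines, max_evalue, min_bitscore)
    return [pid for pid in candidate_ids if pid not in hits]
```

The same parser could be reused for the DEG step by passing `max_evalue=1e-10` and keeping, rather than discarding, the proteins that have a qualifying hit.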
Metabolic pathway analysis and subcellular localization prediction
Essential non-homologous proteins of E. meningoseptica were further analyzed using KAAS (KEGG Automatic Annotation Server) to study metabolic pathways. The KEGG analysis performs BLAST comparisons against the available KEGG gene database and provides metabolic pathway maps, including KO and EC numbers, for a given gene. To determine the location of proteins in the cell, the PSORTb version 3.0 server was used. The essential genes were subjected to BLASTP analysis against FDA-approved drug targets from DrugBank to search for novel drug targets. Targets with an identity of 80% or more are druggable targets, and others that show a considerably lower degree of matching with already approved drug targets can be used as novel targets for new drug identification.
Genomic features of E. meningoseptica G4076 and its comparison with other species
The whole genome of E. meningoseptica G4076 has a length of 3,873,125 bp and a mean GC content of 36.5%; the number of genes as per annotation is 3477. The percentage base composition, calculated using ORIS software, is %A ≈ %T (31.76) and %G ≈ %C (18.23) (Additional file 1: Figure S1), which is in agreement with Chargaff's parity rule. Open reading frame analysis is effective in identifying genes that encode proteins. A total of 7593 ORFs were found in the whole genome. The products are of varying length, and the number of ORFs found exceeds the annotated number of proteins (Additional file 1: Figure S2). To visualize sequence conservation, the circular genome plot was created using the CGView Server (Additional file 1: Figure S3). Gene-coding segments of the E. meningoseptica genome show the typical three-base periodicity, which indicates the underlying codon structure and enables prediction and identification of genes in the majority of bacterial genomes with very high accuracy. Additional file 1: Figure S4A shows the Fourier spectrum with all bases considered and indicates the presence of a three-base periodic signal, as seen in most bacterial genomes. The signal strength is prominent for the purine-pyrimidine representation (Additional file 1: Figure S4B), whereas for individual bases it is considerably lower (Additional file 1: Figure S4C–F).
The RAST server annotation indicates 3477 putative genes; 61 RNAs, comprising 12 ribosomal RNAs (four each of 5S, 16S and 23S) and 49 tRNAs; and 335 subsystems (sets of functional roles) under 27 categories. Sixty-two coding sequences were related to resistance to antibiotics and toxic compounds, which suggests that E. meningoseptica might be multidrug resistant (Additional file 1: Figure S5).
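The composition figures above are internally consistent: with %G ≈ %C ≈ 18.23, the GC content is about 2 × 18.23 = 36.46%, which rounds to the reported 36.5%. The sketch below shows how such percentages and a simple parity check could be computed from a sequence. It is an illustration only and not the ORIS implementation; the function names and the 1-percentage-point tolerance are assumptions made here.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def gc_content(seq: str) -> float:
    """Percentage of G plus C."""
    comp = base_composition(seq)
    return comp["G"] + comp["C"]


def satisfies_parity(seq: str, tolerance: float = 1.0) -> bool:
    """True when %A is within tolerance of %T and %G is within tolerance of %C."""
    comp = base_composition(seq)
    return (abs(comp["A"] - comp["T"]) <= tolerance
            and abs(comp["G"] - comp["C"]) <= tolerance)
```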
Ori-Finder (a web-based software tool for finding oriCs) predicted an oriC region of 649 bp, ranging from 740,720 bp to 741,368 bp, containing three DnaA box sequence motifs (TTATCCACA) with no more than one mismatch. Further, the replication-related gene dnaA is located from 2,613,273 to 2,614,727 bp and is followed by the dnaN gene (Fig. 1A). A cluster of three DnaA boxes and two AT-rich DNA unwinding elements (DUE) indicates a functional chromosomal origin (Fig. 1F). A similar result was obtained with the ORIS v1.0 software tool. DNA asymmetry, the distribution of DnaA boxes and the location of the dnaA gene help in predicting oriC regions [33,34,35,36]. Both graphs enable us to pinpoint the ORI/TER sites. The oriC positions (genome coordinates) predicted by Ori-Finder and ORIS differ by well under 1 kb and are hence in close agreement.
Genomic comparison among Elizabethkingia species [E. meningoseptica G4076 (WP_016198861.1), E. miricola BM10 (WP_034866598.1), E. ursingii G4123 (WP_078402796.1), E. anophelis NUHP1 (WP_009086312.1), E. bruuniana G0146 (WP_034866598.1), F. columnare ATCC49512 (WP_014166114.1), R. anatipestifer ATCC11845 (WP_004918717.1)] was carried out using MEGA X software. Phylogenetic relatedness was assessed by comparing the sequences of a single protein, the 16S rRNA processing protein RimM (ribosome maturation factor RimM) (Additional file 1: Figure S6). E. meningoseptica was found to be at a relatively large phylogenetic distance from the other Elizabethkingia species. Clusters of orthologous genes in E. meningoseptica G4076 were compared with those of four other Elizabethkingia species to provide insights into biological processes, molecular functions and cellular components [22, 23]. Among 3970 clusters, 1401 were orthologous clusters containing at least two species and 2569 were singletons. The number of orthologous genes shared by the five Elizabethkingia genomes was 2606, whereas 17 COGs, involved only in metallopeptidase activity, were present only in the E. meningoseptica G4076 genome (Additional file 1: Figure S7). In pairwise comparisons, the number of COGs ranges from 3396 to 3409 (Additional file 1: Figure S7C).
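One widely used measure of the DNA asymmetry mentioned above is the cumulative GC skew. In many bacterial genomes its minimum lies near the replication origin and its maximum near the terminus. The sketch below illustrates that generic idea only. It is not the algorithm of Ori-Finder or ORIS; the function names, the non-overlapping windows and the default window size are assumptions made for illustration.

```python
def cumulative_gc_skew(seq: str, window: int = 1000) -> list:
    """Cumulative (G - C) / (G + C) over consecutive non-overlapping windows."""
    seq = seq.upper()
    cumulative, running = [], 0.0
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # A window without any G or C contributes zero skew.
        running += (g - c) / (g + c) if (g + c) else 0.0
        cumulative.append(running)
    return cumulative


def skew_extrema(seq: str, window: int = 1000) -> tuple:
    """Return (position of the cumulative-skew minimum, position of the maximum).

    Positions are the 1-based end coordinates of the corresponding windows.
    """
    cumulative = cumulative_gc_skew(seq, window)
    if not cumulative:
        raise ValueError("sequence is shorter than one window")
    i_min = min(range(len(cumulative)), key=cumulative.__getitem__)
    i_max = max(range(len(cumulative)), key=cumulative.__getitem__)
    return (i_min + 1) * window, (i_max + 1) * window
```

A skew-based estimate is coarse (its resolution is one window), so in practice it is combined with other evidence such as DnaA box clusters and the position of dnaA, as described in the text.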
Prediction of essential genes in Elizabethkingia meningoseptica
Subtractive genomic analysis is a fast and efficient method for identifying essential genes in pathogenic species that are non-homologous to the human host. These non-homologous essential genes can be used as putative drug targets against pathogens. The genome of E. meningoseptica G4076 has 3391 annotated proteins. After exclusion of hypothetical proteins and proteins shorter than 100 amino acids, the remaining 2503 were subjected to BLASTP against the proteins of Homo sapiens (host). Using an e-value cut-off of 10⁻⁴ and a bit score > 100, a total of 2052 proteins were found to be non-homologous to host proteins. These proteins were then subjected to BLAST analysis on the DEG server; using an e-value cut-off of 10⁻¹⁰ and a bit score > 100, we shortlisted 692 proteins that are essential for E. meningoseptica G4076 but absent in the host (Additional file 1: Table S1). DEG contains genes that play an important role in cell survival and can be novel targets for antibacterial drugs (Fig. 2).
Metabolic pathway analysis of essential gene and subcellular localization prediction
The shortlisted non-homologous essential genes were analyzed using the KEGG database for metabolic pathway annotation. Only 41 of the 692 proteins were found in pathways unique to the pathogen (Table 1). The majority of them were associated with DNA-binding response regulators, ribosomal proteins, replication and repair, glycan biosynthesis, protein folding and sorting, two-component systems, biotin metabolism and ATP transporters. For drug design it is very important to determine whether a target protein resides on the cell surface or in the cytoplasm, because protein localization plays an important role in drug binding and action. Subcellular localization prediction revealed that, of the 41 target proteins, 80% are cytoplasmic and the rest are located in the periplasm or the cytoplasmic membrane; no extracellular proteins were obtained (Additional file 1: Figure S8). Extracellularly secreted proteins may be better suited for vaccine development. Here, the majority of proteins reside in the cytoplasm and cytoplasmic membrane and can be considered further as potential therapeutic targets. The unique E. meningoseptica essential proteins non-homologous to the host were further subjected to BLASTP against FDA-approved drug targets from DrugBank, which shortlisted 18 target proteins. Of these, two penicillin-binding proteins and two ABC transporter ATP-binding proteins are targets of broad-spectrum antibiotics. The rest include ribosomal proteins (rpsB, rpsl, rpsG, rpsJ, rpsE, rpsM, rpsK, rpsD, rplD, rplP), a recombination protein (recR), DNA polymerase III subunit tau (dnaX) and signal peptidase, which could be explored further as starting points for discovering novel drug candidates. Ribosomal proteins may be more suitable candidates for drug binding because they are mainly involved in translation. Another study also lends support to this choice of drug target. In that regard, further computational analysis may include homology modelling and docking of the selected candidates.
Meningitis and sepsis caused by Elizabethkingia meningoseptica are major illnesses in newborns and immunocompromised patients. Although typical clinical diagnostics are used to identify the illness, a greater understanding of molecular-based diagnosis is desired and remains a long-term goal. The increase in the number of cases in intensive care units (ICUs) makes the infection a big challenge for clinicians to manage. In this context, a comprehensive analysis of whole genome data together with pathway analysis was undertaken, as little computational work on this organism has been reported. Accordingly, a bioinformatics approach was used to characterize the molecular sequence data of Elizabethkingia. Using subtractive genomics, our study identified 41 proteins in Elizabethkingia that are unique with respect to the host, which were further narrowed down to 18 therapeutic target proteins using in silico comparative genomics. The shortlisted ribosomal proteins, which are linked to translation, may be useful for future treatment and management of the infection. We have analyzed the sequence data of E. meningoseptica in an integrated fashion together with pathway analysis. Our study is a small step in the direction of rapid diagnosis and possible drug development.
The current investigation is limited to in silico study only.
Availability of data and materials
The whole genome sequence of Elizabethkingia meningoseptica G4076 (Accession Number NZ_CP016376) was downloaded from the NCBI site https://www.ncbi.nlm.nih.gov/genome/14625?genome_assembly_id=309079. All the protein sequences (numbering 3406) available in FASTA format were used for BLASTP analysis against the human dataset. Selected protein sequences (described in the Methods section) were further used as input for subtractive genomic analysis.
Abbreviations
ORF: Open reading frame
RAST: Rapid Annotations using Subsystems Technology
COG: Clusters of Orthologous Groups
CDC: Centers for Disease Control and Prevention
NCBI: National Center for Biotechnology Information
oriC: Chromosomal origin of replication
MEGA: Molecular Evolutionary Genetics Analysis
ATCC: American Type Culture Collection
DUE: DNA unwinding element
ICU: Intensive care unit
BLAST: Basic Local Alignment Search Tool
DEG: Database of Essential Genes
KEGG: Kyoto Encyclopedia of Genes and Genomes
FDA: Food and Drug Administration
King EO. Studies on a group of previously unclassified bacteria associated with meningitis in infants. Am J Clin Pathol. 1959;31(3):241–7. https://doi.org/10.1093/ajcp/31.3.241.
Ceyhan M, Celik M. Elizabethkingia meningosepticum (Chryseobacterium meningosepticum) infections in children. Int J Pediatr. 2011. https://doi.org/10.1155/2011/215237.
Lin J, Lai C, Yang C, Huang Y. Elizabethkingia infections in humans: from genomics to clinics. Microorganisms. 2019;7:295. https://doi.org/10.3390/microorganisms7090295.
Hazuka BT, Dajani AS, Talbot K, Keen BM. Two outbreaks of Flavobacterium meningosepticum type E in neonatal intensive care unit. J Clin Microbiol. 1977;6(5):450–5.
Amer MZ, Bandey M, Bukhari A, Nemenquani D. Neonatal meningitis caused by Elizabethkingia meningoseptica in Saudi Arabia. J Infect Dev Ctries. 2011;5(10):745–7. https://doi.org/10.3855/jidc.1570.
Kämpfer P, Matthews H, Glaeser SP, Martin K, Lodders N, Faye I. Elizabethkingia anophelis sp. nov., isolated from the midgut of the mosquito Anopheles gambiae. Int J Syst Evol Microbiol. 2011;61:2670–5. https://doi.org/10.1099/ijs.0.026393-0.
Kim KK, Kim MK, Lim JH, Park HY, Lee ST. Transfer of Chryseobacterium meningosepticum and Chryseobacterium miricola to Elizabethkingia gen. nov. as Elizabethkingia meningoseptica comb. nov. and Elizabethkingia miricola comb. nov. Int J Syst Evol Microbiol. 2005;55:1287–93. https://doi.org/10.1099/ijs.0.63541-0.
Nicholson AC, Gulvik CA, Whitney AM, et al. Revisiting the taxonomy of the genus Elizabethkingia using whole-genome sequencing, optical mapping and MALDI-TOF, along with proposal of three novel Elizabethkingia species: Elizabethkingia bruuniana sp. nov., Elizabethkingia ursingii sp. nov., and Elizabethkingia occulta sp. nov. Antonie Van Leeuwenhoek. 2018;111(1):55–72. https://doi.org/10.1007/s10482-017-0926-3.
Bloch KC, Nadarajah R, Jacobs R. Chryseobacterium meningosepticum: an emerging pathogen among immunocompromised adults. Medicine. 1997;76(1):30–41. https://doi.org/10.1097/00005792-199701000-00003.
Pereira GH, Garcia Dde O, Abboud CS, Barbosa VL, Silva PS. Nosocomial infections caused by Elizabethkingia meningoseptica: an emergent pathogen. Braz J Infect Dis. 2013. https://doi.org/10.1016/j.bjid.2013.02.011.
Young SM, Lingam G, Tambyah PA. Elizabethkingia meningoseptica endogenous endophthalmitis: a case report. Antimicrob Resist Infect Control. 2014;3:35. https://doi.org/10.1186/2047-2994-3-35.
Hogeweg P. The roots of bioinformatics in theoretical biology. PloS Comput Biol. 2011. https://doi.org/10.1371/journal.pcbi.1002021.
Altaf-Ul-Amin M, et al. Recent trends in computational biomedical research. Life (Basel). 2022. https://doi.org/10.3390/life12010027.
Singh, et al. ORIS: an interactive software tool for prediction of replication origin in prokaryotic genomes. J Open Source Softw. 2019;4(40):1589. https://doi.org/10.21105/joss.01589.
Grant JR, Stothard P. The CGView Server: a comparative genomics tool for circular genomes. Nucleic Acids Res. 2008;36:W181–4. https://doi.org/10.1093/nar/gkn179.
Tiwari S, Ramachandran S, Bhattacharya A, Bhattacharya S, Ramaswamy R. Prediction of probable genes by Fourier analysis of genomic sequences. Bioinformatics. 1997;13(3):263–70.
Aziz RK, Bartels D, Best AA, et al. The RAST server: rapid annotations using subsystems technology. BMC Genom. 2008;9:75. https://doi.org/10.1186/1471-2164-9-75.
Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, Edwards RA, Gerdes S, Parrello B, Shukla M, Vonstein V, Wattam AR, Xia F, Stevens R. The SEED and the rapid annotation of microbial genomes using subsystems technology (RAST). Nucleic Acids Res. 2014;42:D206–14. https://doi.org/10.1093/nar/gkt1226.
Gao F, Zhang C. Ori-Finder: a web-based system for finding oriCs in unannotated bacterial genomes. BMC Bioinform. 2008;9:79. https://doi.org/10.1186/1471-2105-9-79.
Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018;35(6):1547–9. https://doi.org/10.1093/molbev/msy096.
Xu L, Dong Z, Fang L, Luo Y, Wei Z, Guo H, Zhang G, Gu YQ, Derr DC, Xia Q, Wang Y. OrthoVenn2: a web server for whole-genome comparison and annotation of orthologous clusters across multiple species. Nucleic Acids Res. 2019;47(W1):W52–8. https://doi.org/10.1093/nar/gkz333.
Wang Y, Coleman-Derr D, Chen G, Gu YQ. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species. Nucleic Acids Res. 2014;43(W1):W78–84. https://doi.org/10.1093/nar/gkv487.
Nicholson AC, Humrighouse BW, Graziano JC, et al. Draft genome sequences of strains representing each of the Elizabethkingia genomospecies previously determined by DNA-DNA hybridization. Genome Announc. 2016. https://doi.org/10.1128/genomeA.00045-16.
Lander ES, Linton LM, Biren B, et al. Initial sequencing and analysis of the human genome. Nature. 2001. https://doi.org/10.1038/35057062.
Altschul SF, Gish W, Miller W, et al. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
Acharya A, Garg L. Drug target identification and prioritization for treatment of Ovine foot rot: an in-silico approach. Int J Genom. 2016. https://doi.org/10.1155/2016/7361361.
Zhang R, Ou HY, Zhang CT. DEG: a database of essential genes. Nucleic Acids Res. 2004;32(2):D271. https://doi.org/10.1093/nar/gkh024.
Kanehisa M, Furumichi M, Sato Y, et al. KEGG: integrating viruses and cellular organisms. Nucleic Acids Res. 2021;49:D545–51.
Yu NY, Wagner JR, Laird MR, et al. PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics. 2010;26(13):1608–15.
Hossain T, Kamruzzaman M, Choudhury TZ, et al. Application of subtractive genomics and molecular docking analysis for the identification of novel putative drug targets against Salmonella enterica subsp. Enterica serovar Poona. Biomed Res Int. 2017. https://doi.org/10.1155/2017/3783714.
Parker J. Base composition. Encycl Genet. 2001. https://doi.org/10.1006/rwgn.2001.0115.
Breurec S, Criscuolo A, Diancourt L, et al. Genomic epidemiology and global diversity of the emerging bacterial pathogen Elizabethkingia anophelis. Sci Rep. 2016;6:30379. https://doi.org/10.1038/srep30379.
Mackiewicz P, Zakrzewska-Czerwinska J, Zawilak A, Dudek MR, Cebrat S. Where does bacterial replication start? Rules for predicting the oriC region. Nucleic Acids Res. 2004;32(13):3781–91. https://doi.org/10.1093/nar/gkh699.
Gao F. Recent advances in the identification of replication origins based on the Z-curve method. Curr Genom. 2014;15(2):104–12. https://doi.org/10.2174/1389202915999140328162938.
Zhang CT, Zhang R, Ou HY. The Z curve database: a graphic representation of genome sequences. Bioinformatics. 2003;19(5):593–9. https://doi.org/10.1093/bioinformatics/btg041.
Roy S. Molecular markers in phylogenetic studies—a review. J Phylogenetics Evol Biol. 2014;2:2. https://doi.org/10.4172/2329-9002.1000131.
Uddin R, Saeed K. Identification and characterization of potential drug targets by subtractive genome analyses of methicillin resistant Staphylococcus aureus. Comput Biol Chem. 2014;48:55–63.
Polikanov YS, Aleksashin NA, Beckert B, Wilson DN. The mechanisms of action of ribosome-targeting peptide antibiotics. Front Mol Biosci. 2018;5:48. https://doi.org/10.3389/fmolb.2018.00048.
The financial assistance provided to Neha Girdhar under Women Scientist Scheme-A (WOS-A) vide Reference No. SR/WOS-A/LS-222/2016 by Department of Science and Technology, Government of India is gratefully acknowledged.
Ethics approval and consent to participate
The authors declare that no ethical approval is required for the current study.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Additional file 1: Figure S1. Percentage distribution of DNA base composition in the E. meningoseptica G4076 genome. Figure S2. Open reading frame viewer: a window showing ORFs on the interval from 1 to 50,000 nucleotides. Figure S3. Circular genomic plot of E. meningoseptica. Figure S4. Fourier transform spectrum. Figure S5. Annotation of the Elizabethkingia meningoseptica G4076 genome using the RAST server. Figure S6. Phylogenetic tree of Elizabethkingia species. Figure S7. Cluster of genes, Venn diagram and pairwise heat map among Elizabethkingia species. Figure S8. Pie chart showing subcellular localization of proteins. Table S1. List showing subtractive genomic and metabolic pathway analysis results for E. meningoseptica.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Girdhar, N., Kumari, N. & Krishnamachari, A. Computational characterization and analysis of molecular sequence data of Elizabethkingia meningoseptica. BMC Res Notes 15, 133 (2022). https://doi.org/10.1186/s13104-022-06011-5
- Elizabethkingia meningoseptica
- Genome annotation
- Pathway analysis
- Subtractive genomics