The diversity of unique 1,4,5,6-Tetrahydro-2-methyl-4-pyrimidinecarboxylic acid coding common genes and Universal stress protein in Ectoine TRAP cluster (UspA) in 32 Halomonas species

Objectives To decipher the diversity of unique ectoine-coding housekeeping genes in the genus Halomonas. Results In Halomonas, 1,4,5,6-tetrahydro-2-methyl-4-pyrimidinecarboxylic acid (ectoine) plays a crucial role as a stress-tolerance chaperone, a compatible solute and a cell membrane stabilizer, and it reduces cell damage under stressful conditions. Alongside the current 16S rRNA biomarker, it serves as a blueprint for identifying Halomonas species. Halomonas elongata 1H9 was found to have 11 ectoine-coding genes. Genome annotation of 93 Halomonas spp. revealed a superfamily of conserved ectoine-coding genes among members of the genus. Genome-wide evaluation of the 11 single-copy ectoine-coding genes in 32 Halomonas spp. indicates that these species are very closely associated with H. elongata 1H9, which provides an evidence-based approach to elucidating the phylogenetic relatedness of ectoine-coding child taxa in the genus Halomonas. In total, 32 Halomonas species carry a single copy of each of 11 distinct ectoine-coding genes that help Halomonas spp. produce ectoine under stressful conditions. Furthermore, the existence of the universal stress protein (UspA) gene suggests that Halomonas species developed directly from primitive bacteria, highlighting its role during the progression of microbial evolution. Supplementary Information The online version contains supplementary material available at 10.1186/s13104-021-05689-3.


Introduction
1,4,5,6-Tetrahydro-2-methyl-4-pyrimidinecarboxylic acid (C6H10N2O2, molecular weight 142.16) is a natural compatible solute produced within the cytoplasm of salt-loving bacteria (e.g. genus Ectothiorhodospira; Halomonas). This compound, generally termed 'ectoine', helps the bacterium perform osmoregulation [6]. The moderately halophilic members of the family Halomonadaceae display osmoadaptation facilitated by compatible solutes such as betaine and ectoine [3] and hydroxyectoine [12].
The family Halomonadaceae comprises a total of 18 child taxa. Of these, the names of 14 child taxa are validly published with their correct name. Sixteen child taxa have validly published names, including synonyms, under the International Code of Nomenclature of Prokaryotes (ICNP). Similarly, the genus Halomonas is currently represented by 114 type strains, of which 112 have a validly published and correct name and 10 have synonyms. In addition, three species have orthographically misspelled variants, and 18 species names have not been validated under the ICNP [7]. Descriptions of all Halomonas species are available on the LPSN portal, managed by the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Germany.
Halomonas species are known producers of biotechnologically important ectoine. Accumulated in the cytoplasm, the ectoine and hydroxyectoine produced by Halomonas species benefit the cell: they act as stress-tolerance chaperones and compatible solutes, stabilize the cell membrane and reduce cell damage [10]. Moreover, ectoine and hydroxyectoine are high-value chemicals exploited in cosmetics, for immune protection and stabilization of antibodies, as anti-inflammatory and tissue-protective agents, in co-production with the bioplastic polyhydroxybutyrate [8], and as anti-aging skin protectants against harsh conditions such as radiation and extreme temperatures.
Intracellular ectoine protects whole cells and macromolecules under hostile conditions against freezing, drying, high salinity, heat stress, oxygen radicals, radiation and denaturing agents [10]. The varied applications of ectoine produced by Halomonas species reflect the presence of diverse gene profiles and other conserved genes in their genomes. It is therefore important to evaluate the signature genes that code for ectoine and govern vital biological functions under extreme environmental conditions across the genus Halomonas.
The present study provides a blueprint of the ectoine-coding genes identified in H. elongata. Genome annotation of existing Halomonas spp. uncovered common ectoine-coding genes among members of the genus, so genome-wide evaluations of these genes were carried out. We also analyzed the 32 Halomonas spp. most closely related to Halomonas elongata 1H9, whose phylogenetically related ectoine-coding child taxa were inferred using the identified single-copy genes.

Type strain 16S rRNA genes and 94 Halomonas spp. genomes
One hundred twenty-eight 16S rRNA genes of type strains and 94 complete genomes and reference sequences of Halomonas spp., deposited between 2006 and 2020, were obtained from LPSN and the NCBI genome database.

Radar chart
Halomonas spp. present multiple quantitative variables per species, notably variable genome length, as data points for visualization. A radar chart makes it easy to compare these variable lengths across species, to see similar values, and to spot high- or low-scoring outliers within the genus.
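The geometry behind such a radar chart can be sketched in a few lines. This is only an illustration: the strain names and genome lengths below are hypothetical placeholders, not values from the study, and the function `radar_vertices` is introduced here for the example. Each species gets one equally spaced spoke, and the distance from the centre is the genome length scaled to the largest genome, so outliers appear as unusually short or long spokes.

```python
import math

def radar_vertices(values):
    """Place one value per axis on equally spaced spokes, scaled to the largest value."""
    largest = max(values)
    n = len(values)
    points = []
    for i, value in enumerate(values):
        angle = 2 * math.pi * i / n   # direction of the spoke for axis i
        radius = value / largest      # 1.0 for the largest genome
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points

# Hypothetical genome lengths in Mb (assumed for illustration only).
genome_mb = {"strain_A": 4.0, "strain_B": 3.5, "strain_C": 5.0, "strain_D": 2.0}
vertices = radar_vertices(list(genome_mb.values()))
for name, (x, y) in zip(genome_mb, vertices):
    print(f"{name}: x={x:.2f}, y={y:.2f}")
```

The resulting vertices can be passed to any plotting tool that draws closed polygons.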

Identification of protein families and single copy genes
Protein families and single-copy genes in 93 Halomonas spp. were identified using PATRIC 3.6.9 (https://www.patricbrc.org/). Genus-specific protein families (PLfams) were computed with MCL inflation = 3.0 to obtain higher sequence similarity and better specificity for close intra-genus and intra-species comparisons.
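To illustrate what the inflation parameter does, the sketch below applies a single MCL inflation step to a small matrix. It does not reproduce the PATRIC pipeline; the matrix is a hypothetical toy example and `inflate` is a name introduced here. Inflation raises each entry to the power r and rescales every column to sum to 1, so a larger r strengthens strong links and weakens weak ones, which tends to yield tighter, more specific families.

```python
def inflate(matrix, r=3.0):
    """MCL inflation: raise every entry to the power r, then rescale each column to sum to 1."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    powered = [[matrix[i][j] ** r for j in range(n_cols)] for i in range(n_rows)]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        col_sum = sum(powered[i][j] for i in range(n_rows))
        for i in range(n_rows):
            result[i][j] = powered[i][j] / col_sum if col_sum else 0.0
    return result

# Hypothetical column-stochastic similarity matrix for three proteins.
flow = [
    [0.5, 0.3, 0.2],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
]
sharper = inflate(flow, r=3.0)   # inflation value used in the text
milder = inflate(flow, r=1.5)    # lower value, shown for comparison only
print(round(sharper[0][0], 5), round(milder[0][0], 5))
```

In the full algorithm this step alternates with matrix expansion until the matrix converges; only the inflation step is shown here.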

Selecting single copy number genes
PLfams of 1,4,5,6-Tetrahydro-2-methyl-4-pyrimidinecarboxylic acid coding genes among 93 Halomonas spp. were extracted. Genes common to the Halomonas species were selected for analysis. The topology of the phylogenetic tree generated from the concatenated sequences was compared with that of the 16S rRNA-based tree of Halomonas child taxa.
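The selection step can be sketched as follows, assuming each genome has already been annotated with a list of family labels. The genome names, sequences and helper functions are hypothetical; only the gene names (EctC, UspA, TeaA) come from the text. A family is kept when it occurs exactly once in every genome, and the retained sequences are then joined in a fixed order. In practice each gene would be aligned across genomes before concatenation, which this sketch omits.

```python
from collections import Counter

def single_copy_families(genome_to_genes):
    """Return the families that occur exactly once in every genome."""
    counts = {genome: Counter(genes) for genome, genes in genome_to_genes.items()}
    all_families = set().union(*counts.values())
    return sorted(f for f in all_families if all(c[f] == 1 for c in counts.values()))

def concatenate(genome_to_seqs, families):
    """Join the sequences of the chosen families in the same order for every genome."""
    return {genome: "".join(seqs[f] for f in families)
            for genome, seqs in genome_to_seqs.items()}

# Hypothetical annotations: TeaA is duplicated in genome_2 and absent from genome_3.
annotations = {
    "genome_1": ["EctC", "UspA", "TeaA"],
    "genome_2": ["EctC", "UspA", "TeaA", "TeaA"],
    "genome_3": ["EctC", "UspA"],
}
shared = single_copy_families(annotations)

# Hypothetical toy sequences keyed by family name.
sequences = {
    "genome_1": {"EctC": "ATGA", "UspA": "ATGC", "TeaA": "ATGG"},
    "genome_2": {"EctC": "ATGT", "UspA": "ATGC", "TeaA": "ATGG"},
    "genome_3": {"EctC": "ATGA", "UspA": "ATGG"},
}
supermatrix = concatenate(sequences, shared)
print(shared, supermatrix["genome_1"])
```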

Phylogeny reconstruction and topology analysis
The evolutionary history of one hundred twenty-eight 16S rRNA genes and 33 Halomonas single-copy genes was inferred using the standalone tool MEGA X with 1000 bootstrap replicates, followed by the best-scoring maximum likelihood (ML), neighbor-joining (NJ) and minimum evolution (ME) trees. The evolutionary distances were computed using the Jukes-Cantor method and are in the units of the number of base substitutions per site. The child taxa closest to the biotechnologically important ectoine producer H. elongata 1H9 were identified. This supports phylogenetic analysis and topology comparison to delineate the nearest species and the species carrying 1,4,5,6-Tetrahydro-2-methyl-4-pyrimidinecarboxylic acid coding genes.
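For readers unfamiliar with the Jukes-Cantor correction, a minimal sketch follows. It is not MEGA X code; the function name and the toy sequences are assumptions made for illustration. Given the proportion p of differing sites between two aligned sequences, the distance is d = -(3/4) ln(1 - (4/3) p), expressed in substitutions per site.

```python
import math

def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned DNA sequences, in substitutions per site."""
    # Compare only positions where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)   # observed proportion of differences
    if p >= 0.75:
        return math.inf                              # correction is undefined at saturation
    return -0.75 * math.log(1 - 4 * p / 3)

# Hypothetical aligned fragments differing at 1 of 10 sites.
d = jukes_cantor("ACGTACGTAC", "ACGTACGTAT")
print(round(d, 4))
```

The corrected distance is slightly larger than the raw proportion of differences because the model accounts for multiple substitutions at the same site.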

Identification of protein families, single copy genes and Pearson correlation
Whole-genome analyses and annotation have resolved the mystery of the unique genes distributed across the genus Halomonas. The radar chart shows that the existing genomic data for Halomonas spp. comprise complete genome sequences, reference genomes and some scaffolds (Additional file 3: Figure S3, Additional File 6: Table S1). Available genomic sequence data show a similar gene pool, but not all 93 type strains carry the full set of ectoine-coding sequences. To resolve this issue and find the relevant species in the genus Halomonas, we annotated all genomes and identified the single-copy genes that code for ectoine. It was noticed that a few Halomonas species have more than 11 single-copy ectoine-coding genes. The inferred ML tree (Additional file 4: Figure S4) shows that the type strains carrying the ectoine biomarker (1H9, F9-6, AJ261, SP4, ACAM 71, 62, Hb3, DSM 15911, N12, NTU-107, G-16.1, ZJ2214, TBZ3, M29, 79, BJGMM-B45, LCB169, CFH 9008, AIR-2, DQD2-30, 4A, SL014B-69, TBZ202, DX6, 9-2 and MC28) are broadly the same representative species recovered from the concatenated sequences of the 32 Halomonas species (Fig. 1). Of the 93 annotated genome sequences, 31 + 1 (32) species have the 11 ectoine-coding genes (DoeA-DoeC-DoeX-EctC-EctD-EutB-EutC-TeaA-TeaB-TeaC-UspA) as single-copy genes (Additional file 5: Figure S5; Table 1). The heatmap of the 11 ectoine-coding genes shows a high degree of Pearson correlation (Fig. 2), with values between 0.50 and ±1 (0 = no correlation, 1 = high correlation).
Moreover, ectoine and its derivatives have been investigated by various groups worldwide for their biotechnological applications. For instance, a few reports suggest that ectoine or its derivatives have been used in oral care, for vulvovaginal conditions and in some cosmetic formulations to protect against cell damage and avoid microbial infections.
Reports also suggest that ectoine and ectoine derivatives, in combination with natural essential oils, have been employed as an effective solution against pathogenic Pseudomonas aeruginosa [1] and antifungal-resistant Candida strains causing candidiasis [4,5]. From a biotechnological perspective, therefore, ectoine and its derivatives may have applications against antimicrobial-resistant and multidrug-resistant microorganisms.
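The Pearson correlation values reported for the heatmap of the 11 ectoine-coding genes (Fig. 2) can be illustrated with a short sketch. The text does not state which per-gene values were correlated, so the numeric profiles below are hypothetical placeholders, and `pearson` is a helper written for this example. The coefficient ranges from -1 to +1, with 0 indicating no linear correlation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (spread_x * spread_y)

# Hypothetical per-gene profiles across four species (assumed values).
profiles = {
    "EctC": [1.0, 2.0, 3.0, 4.0],
    "UspA": [2.0, 4.0, 6.0, 8.0],
    "TeaA": [4.0, 3.0, 2.0, 1.0],
}
# Pairwise matrix of the kind a correlation heatmap displays.
matrix = {a: {b: pearson(profiles[a], profiles[b]) for b in profiles} for a in profiles}
for gene, row in matrix.items():
    print(gene, {other: round(value, 2) for other, value in row.items()})
```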

Conclusion
Ectoine signatures can be found in the 93 publicly available Halomonas genome sequences. Thirty-two Halomonas species carry 11 distinct ectoine genes in single copy in their genomes, which help Halomonas spp. produce ectoine under stressful conditions. Based on existing genomic data, H. elongata 1H9 was found to have ectoine-producing machinery distinct from that of other Halomonas species. The existence of 11 distinct genes in 32 species, including the UspA gene, suggests that Halomonas species evolved directly from their primitive ancestor, shedding light on their evolutionary significance.