The complete genome sequence of Hafnia alvei A23BA: a potential antibiotic-producing rhizobacterium

The urgent need for novel antibiotics cannot be overemphasized. Hafnia alvei A23BA was isolated from the plant rhizosphere as part of an effort to recover novel antibiotic-producing bacterial strains from soil samples. The genome of the isolate was sequenced to facilitate mining for potential antibiotic-encoding biosynthetic gene clusters and to gain insights into how these gene clusters could be activated. Here, we report the complete genome sequence of H. alvei A23BA obtained from the hybrid assembly of Illumina HiSeq and GridION reads. The genome, consisting of a circular chromosome and a circular plasmid, is 4.77 Mb in size with a GC content of 48.77%. The assembly is 99.5% complete, with genomic features including 4,217 CDSs, 125 RNAs, and 30 pseudogenes. Thiopeptide, beta-lactone, siderophore, and homoserine lactone biosynthetic gene clusters were also identified. Other gene clusters of interest include those associated with bioremediation, biocontrol, and plant growth promotion, all of which are reported for H. alvei for the first time. This dataset serves to expedite the exploration of the biosynthetic and metabolic potential of the species. Furthermore, being the first published genome sequence of a soil isolate, this dataset enriches the comparative genomics study of H. alvei strains.


Objective
Bacterial secondary metabolites are invaluable sources of novel bioactive compounds. Many clinically useful antibiotics were derived from the secondary metabolites of soil-dwelling bacteria [1]. However, only a small fraction of all known species have had their metabolites exploited in this way [2]. To this end, we sought to isolate novel antibiotic-producing bacterial strains from soil samples collected from the rhizosphere, where antibiosis occurs naturally [3]. H. alvei A23BA was recovered as part of this effort.
Hafnia alvei is a Gram-negative, rod-shaped, facultatively anaerobic, psychrotrophic bacterium. It is commonly isolated from clinical materials, the gastrointestinal tracts of animals, plant surfaces, soil, and water [4]. Some strains are commensals of the gastrointestinal tract, while others are opportunistic pathogens implicated in both nosocomial and community-acquired infections [5,6]. The species is almost never associated with antibiotic production, except for the antimicrobial activities reported for a strain isolated from the gut of honeybees [7]. Phylogenetic studies of this little-known species have shown its pan-genome to be open and dynamic, with each strain possessing sets of unique genes [8]. Unique gene acquisition is mainly by horizontal gene transfer, and it reflects the adaptation of strains to their remarkably diverse natural habitats. Strains show considerable metabolic [...]. Consequently, the genome of H. alvei A23BA was sequenced to enable mining for potential antibiotic-encoding secondary metabolite biosynthetic gene clusters (smBGCs) that show little or no homology to known smBGCs. Furthermore, assembled genomes of H. alvei in public repositories are typically of clinical, human or food isolates; to the best of our knowledge, the complete genome sequence of H. alvei A23BA represents the first published complete genome sequence of a soil isolate.

Data description
H. alvei A23BA was recovered from the rhizosphere of a garden plant in Aberdeen, Scotland (57.101° N, 2.078° W). It was isolated using an ultra-minimal substrate medium (Data file 1) [9]. Upon isolation and strain purification, the isolate was cultivated in nutrient broth (Oxoid, UK) at 37 °C for 24 h. The overnight culture was centrifuged and gDNA was extracted from the pellets with the DNeasy® UltraClean® Microbial Kit for DNA Isolation (Qiagen, UK). The isolate was preliminarily identified as H. alvei by 16S rRNA gene sequence comparison, with a 99% identity score.
In summary, the complete genome sequence of H. alvei A23BA is 4,772,047 bp in size, with an overall GC content of 48.77% and a sequencing coverage of 256.0×. It comprises one circular chromosome (4,687,005 bp; GC content 48.8%) and one circular plasmid (85,042 bp; GC content 47.2%). Genomic features include 4,217 CDSs, 25 rRNAs, 92 tRNAs, 8 ncRNAs, 30 pseudogenes and 2 CRISPRs. Thiopeptide, beta-lactone (both showing little or no homology to known smBGCs) and siderophore smBGCs were identified (Data file 7) [29]. Thiopeptides and beta-lactones are known for their antibiotic and/or anticancer activities [30,31], while siderophores are used clinically as "Trojan horses" to deliver antibiotics to antibiotic-resistant bacteria [32]. Gene clusters commonly associated with bioremediation, biocontrol, environmental adaptation, and plant growth promotion were also identified (Data file 8) [33]. Please see Table 1 for links to Data files 1-8.
Given the quality control measures applied and the results of the analyses undertaken, we believe that the dataset "Hafnia alvei strain A23BA chromosome, complete genome" [34] is of high quality, that it will expedite the exploration of the biosynthetic and metabolic potential of H. alvei A23BA, and that it will enrich the comparative genomics study of H. alvei strains.
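The summary statistics reported above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the authors' pipeline: the variable and function names are illustrative, and the per-replicon GC values are the rounded figures quoted in the text, so the length-weighted GC content is expected to match the reported overall value only approximately.

```python
# Replicon sizes and GC percentages as quoted in the text.
# The GC values are rounded to one decimal place in the source.
replicons = {
    "chromosome": {"length_bp": 4_687_005, "gc_percent": 48.8},
    "plasmid": {"length_bp": 85_042, "gc_percent": 47.2},
}

# RNA gene counts as quoted in the text.
rna_counts = {"rRNA": 25, "tRNA": 92, "ncRNA": 8}


def total_length(reps):
    """Sum the lengths of all replicons, in base pairs."""
    return sum(r["length_bp"] for r in reps.values())


def weighted_gc(reps):
    """Length-weighted mean GC percentage across replicons."""
    total = total_length(reps)
    return sum(r["length_bp"] * r["gc_percent"] for r in reps.values()) / total


genome_bp = total_length(replicons)
genome_gc = weighted_gc(replicons)
total_rnas = sum(rna_counts.values())

print(f"Genome size: {genome_bp:,} bp")
print(f"Length-weighted GC: {genome_gc:.2f}%")
print(f"Total RNA genes: {total_rnas}")
```

Under these assumptions, the chromosome and plasmid lengths sum to the reported genome size of 4,772,047 bp, and the rRNA, tRNA and ncRNA counts sum to the 125 RNAs given in the abstract.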

Limitations
This dataset was generated from a hybrid assembly to ensure accuracy and completeness. Furthermore, the hybrid assembler (Unicycler) autocorrects read errors and polishes final assemblies several times to ensure accuracy. Annotations and metabolic pathway analyses were carried out with robust and validated bioinformatics tools, and smBGCs were identified with the most comprehensive genome mining tool to date. Therefore, the authors are currently unaware of any limitations of the data.