Skip to main content

Genomic sequencing of fourteen Bacillus thuringiensis isolates: insights into geographic variation and phylogenetic implications



This work was performed in support of a separate study investigating the activity of pesticidal proteins produced by Bacillus thuringiensis against the Asian citrus psyllid, Diaphorina citri. The fourteen Bacillus isolates chosen were selected from a large, geographically diverse collection that was characterized only by biochemical phenotype and morphology of the parasporal crystal, hence, for each isolate it was desired to determine the specific pesticidal proteins produced, assign each to a Bacillus cereus multilocus sequence type (ST), and predict their placement within the classical Bt serotyping system. In addition, phylogenetic distances between the isolates and Bacillus thuringiensis serovar type strains were determined by calculating digital DNA-DNA hybridization (dDDH) values among the isolates.


Based on the assembled sequence data, the isolates were found to be likely representatives of the Bt serovars kurstaki (ST 8), pakistani (ST 550), toumanoffi (ST 240), israelensis (ST 16), thuringiensis (ST 10), entomocidus (ST 239), and finitimus (ST 171). In cases where multiple isolates occurred within a predicted serovar, pesticidal protein profiles were found to be identical, despite the geographic diversity of the isolates. As expected, the dDDH values calculated for pairwise comparisons of the isolates and their apparent corresponding Bt serovar type strains were quite high (> 98%), however dDDH comparisons of the isolates with other serovar type strains were often surprisingly low (< 70%) and suggest unrecognized taxa within Bt and the Bacillus cereus sensu lato.

Peer Review reports


Bacillus thuringiensis (Bt) is a Gram-positive spore forming bacterium that is best known for its use in the management of certain types of insects. The factors responsible for the insecticidal activity of Bt are typically crystalline pesticidal proteins that are produced by the bacteria upon sporulation [1], and it is these protein crystals that are the defining feature of Bt as a species. Here we describe the genomic sequencing and assembly of fourteen Bt isolates which were screened for activity against the Asian citrus psyllid (Diaphorina citri); the results of those assays will be presented in a separate report. These soil isolates were from an extensive collection of Bts from around the world that were characterized entirely on basis of biochemical phenotype and crystal morphology [2]. As the collection is largely uncharacterized genetically, isolates were chosen to represent a diversity of biochemical traits, crystal morphologies, and geographic sources (see Additional file 1). Geographically, the fourteen isolates represented seven states from within the US, as well as Norway, Nepal, Argentina, and Vietnam.

We used the genome assemblies to identify the pesticidal proteins the isolates are predicted to produce, determined their multilocus sequence types within the Bacillus cereus sensu lato, and predicted the placement of the isolates within the classical Bt serovar classification scheme. Finally, we used a whole genome alignment method to better assess the genetic distances between the isolates and the Bt serovars they represent.

Materials and methods

The isolates were grown for 8 h in 20 ml of Luria broth on an orbital shaker at 30˚ C and 200 rpm. Genomic DNA was extracted from bacterial cell pellets using Qiagen’s Purgene Yeast/Bacteria Kit B according to the manufacturer’s gram-positive bacteria protocol. DNA concentrations were determined with a QuantiFluor™-ST Fluorometer (Promega, Madison, WI) using the QuantiFluor™ dsDNA System protocol.

Quality controlled DNA was sequenced using PacBio Sequel II (operated by GENEWIZ, South Plainfield, NJ, USA) and Illumina MiSeq (operated in-house at USDA-ARS Invasive Insect Biocontrol and Behavior Laboratory, Beltsville, MD, USA) instruments. Resulting sequencing reads achieved, as well as NCBI SRA accession identifiers, are presented in Additional file 2. PacBio subreads were filtered and assembled using pbcromwell v1.2.0 (contained in the SMRT Link v10.1.0.119588 software package, downloaded November 4, 2021 from employing default parameters. Assembled contigs were subsequently polished using Illumina short-read data with Pilon v1.24 [3]. Two exceptions to this protocol included the case of isolate IBL02897, for which pbcromwell failed to assemble a 350 kb plasmid encoding multiple pesticidal proteins, although clear evidence for its presence existed among unassembled subreads (Additional file 3). This strain’s subreads were independently assembled using Canu v2.2 [4], the results of which contained the plasmid of interest. The Canu-assembled plasmid was then short-read polished with Pilon and manually added to the isolate’s overall genome assembly file. The second case involved IBL03111 (putative Bt kurstaki), for which sequencing depth plots and comparisons with another apparent Bt kurstaki, isolate IBL01313, indicated that pbcromwell had erroneously incorporated a ~ 330 kb pesticidal protein gene-encoding plasmid into the assembly’s main chromosome (Additional file 3). A substitute assembly was prepared by assembling IBL03111 PacBio subreads with Canu, polishing with Illumina short-read data using Pilon and comparing results with strain IBL01313 using nucmer from the MUMmer v3.23 suite [5]. Contigs lacking evident similarity to IBL01313 were culled from the revised IBL03111 assembly; these were all quite short, ranging from 8,784 to 17,316 bases, and manual inspection suggested they were likely assembly artifacts (Additional file 4). Protein coding gene prediction was performed using Prodigal v2.6.3 [6] with default parameters. Raw read data and polished assemblies are organized under NCBI BioProject PRJNA848845.

Bacillus cereus multilocus sequence types (ST) of isolates were determined with the system devised by Priest et al. [7], using the sequence query utility of PubMLST [8](, which allows sequence types to be determined from whole genome sequence queries. In this system, isolates within a ST share 100% nucleotide sequence identity across PCR amplicons of seven single copy housekeeping genes totaling ca. 2830 nucleotides.

Classically, different types of Bt were distinguished from one another serologically based on antisera developed primarily against flagellar proteins [9]. In this study, prediction of classical serovar status was accomplished by finding Bt serovar type strains with flagellin sequences that matched those predicted for our isolates. Flagellin sequences of our isolates were identified using BLASTp of Prodigal predicted proteins, and the resulting sequences used to query GenBank. Strains of Bt producing identical flagellin proteins were then examined for the presence of serovar type strains.

To gauge phylogenetic distances between the isolates, classical serovar type strains, and relevant species of Bacillus, digital DNA-DNA hybridization (dDDH) values were calculated for pairwise genomic alignments between these groups using the Type Strain Genome Server at DSMZ at This method predicts the results of a classical DNA-DNA hybridization experiment based entirely on genomic sequence data, with values > 70% indicating the genomes being compared belong to the same species [10].

Potential pesticidal proteins that the isolates could produce were identified by performing local BLASTp (BLAST + 2.11.0) [11] on protein sequences predicted by Prodigal, using holotype pesticidal protein sequences obtained from the Bacterial Pesticidal Protein Resource Center ( [12] as queries. Potential pesticidal proteins found in this way were identified using the BestMatchFinder at the Bacterial Pesticidal Protein Resource Center. Pesticidal protein nomenclature follows that of Crickmore et al. [13].

Results and discussion

Genome assembly and gene finding results, as well as NCBI WGS accession identifiers, are presented in Table 1; contig-specific results are shown in Additional File 5. The geographic origins of the isolates, their STs, accession identifiers of their flagellin sequences, putative serovars, and pesticidal protein profiles are summarized in Table 2.

Table 1 Genome assembly descriptive statistics and overall predicted protein coding gene counts
Table 2 Isolate information. Sample location, multilocus sequence type, GenBank accession numbers of predicted flagellar proteins, predicted classical serovar, and predicted pesticidal proteins

The fourteen isolates were found to represent seven distinct B. cereus STs; ST 8, ST 10, ST 16, ST 171, ST 239, ST 240, and ST 550. The PubMLST database includes a list of all isolates that have been reported to share a given ST. For each of the STs listed above, the database included a single serovar type strain of a classical Bt serovar, including Bt kurstaki HD1 (ST 8), Bt thuringiensis BGSC 4A3 (ST 10), Bt israelensis BGSC 4Q1 (ST 16), Bt finitimus BGSC 4B2 (ST 171), Bt entomocidus BGSC 4I4 (ST 239), Bt toumanoffi BGSC 4N1 (ST 240), and Bt pakistani BGSC 4P1 (ST 550). Historically, antisera for Bt serotyping were developed against flagellar preparations, with specificity of these antisera presumably reflecting the high variability of flagellin proteins among Bt varieties. In the current study, the predicted flagellin sequences for all fourteen isolates perfectly matched those of the Bt serovar type strain corresponding to their ST in both number of proteins and sequence identity of those proteins (Table 2). Note that the number of flagellin proteins among serotype strains cited here ranged from one to four, and no single flagellin sequence was shared among different serotype strains. While it is not possible to say that an isolate belongs to a particular serovar without actually performing the serological test, it seems exceedingly likely that these isolates represent the predicted serovars.

Prior studies have demonstrated that a given serovar can exhibit variations in pesticidal protein profiles. Bacillus thuringiensis kurstaki HD-1 and HD-73, both ST 8, have different Cry protein profiles and plasmids [14, 15], and it is known that Bt morrisoni and tenebrionis are the same serotype, but with very different Cry profiles [16]. In this study we found a high degree of correlation between predicted pesticidal protein profile and ST or predicted serovar, despite the geographic diversity represented among the isolates (Table 2). Thus, IBL00090, IBL00171, and IBL02897, representing Bt toumanoffi, had extensive complements of identical pesticidal proteins encoded on contigs of nearly identical size, despite originating from Wyoming, Wisconsin, and Vietnam. Similar results were observed for IBL00055, IBL00144, and IBL00210 representing Bt pakistani from Wyoming, Florida, and Maryland respectively. The four apparent Bt kurstaki isolates were also found to have identical assortments of pesticidal proteins, although the size of the contigs on which they were encoded varied more than was observed for the Bt pakistani or toumanoffi isolates.

While these results might have been expected for Bt kurstaki, which has been spread around the world as a bioinsecticide, it was somewhat surprising for Bt pakistani and toumanoffi, which are presumably not widely dispersed by human activity. We did find that IBL00971, an apparent Bt finitimus, lacks a plasmid previously reported for Bt finitimus YBT-020 that encodes an additional copy of Cry26Aa [17], however the toxin profiles of these isolates were nevertheless identical.

The most surprising results generated in this study were the rather large phylogenetic distances between some of the classical Bt serovars. As expected, dDDH values obtained when comparing isolates with their corresponding serotype strains were very high (> 98%; see Table 3), however, comparisons with other serotype strains often resulted in values well below the 70% minimum expected of conspecific strains. The dDDH calculations obtained suggest that while Bt kurstaki, pakistani, thuringiensis, and entomocidus group with Bacillus cereus ATCC 14579T, Bt israelensis and toumanoffi did not group with B. cereus ATCC 14579T, B. thuringiensis ATCC 10792T, or any other type strains represented in the TYGS database and may represent an unrecognized taxon. Likewise, IBL00971 and Bt finitimus BGSC 4B2 appeared most closely related to the recently described B. paranthracis [18] but produced dDDH values which fell just below the critical 70% cutoff required to be considered examples of that species. A larger phylogenetic analysis based on several of the recent genomic alignment methods might be very useful for reexamining species boundaries within the Bacillus cereus group.

Table 3 Formula d4 dDDH values (%) calculated by comparing genomic sequences of the indicated isolates with those of serovar type strains of selected Bt serovars and selected B. cereus group type strains. Shaded cells indicate dDDH values exceeding 70%


In this report, Bt serovars are predicted based on flagellin sequences of the isolates and those of the original serovar type strains, and not an actual serological test. They are only predictions.

Data Availability

Raw read data and polished assemblies are organized under NCBI BioProject PRJNA848845. Bacillus isolates used in this study are available for research purposes subject to export restrictions.



Alpha-helical pesticidal protein


Basic local alignment search tool

Bt :

Bacillus thuringiensis


Crystal protein


Cytolytic protein


Digital DNA-DNA hybridization


Deutsche Sammlung von Mikroorganismen und Zellkulturen


Mtx2-related pesticidal protein


Polymerase chain reaction


Sequence type


Toxin complex a


Type strain genome server


Vegetative insecticidal protein


Active component of the Vpa/Vpb binary pesticidal protein ​​​​​​​


Binding component of the Vpa/Vpb binary pesticidal protein ​​​​​​​


  1. Höfte H, Whiteley H. Insecticidal crystal proteins of Bacillus thuringiensis. Microbiol Rev. 1989;53(2):242–55.

    Article  PubMed  PubMed Central  Google Scholar 

  2. Martin PA, Gundersen-Rindal DE, Blackburn MB. Distribution of phenotypes among Bacillus thuringiensis strains. Syst Appl Microbiol. 2010;33(4):204–8.

    Article  CAS  PubMed  Google Scholar 

  3. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9(11):e112963.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  4. Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27(5):722–36.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  5. Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001;29(22):4633–42.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  6. Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11(1):1–1.

    Article  CAS  Google Scholar 

  7. Priest FG, Barker M, Baillie LW, Holmes EC, Maiden MC. Population structure and evolution of the Bacillus cereus group. J Bacteriol. 2004;186(23):7959–70.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  8. Jolley KA, Bray JE, Maiden MC. Open-access bacterial population genomics: BIGSdb software, the website and their applications. Wellcome Open Res. 2018;3.

  9. de Barjac H. Identification of H-serotypes of Bacillus thuringiensis. In: Burgess HD, editor. Microbial Control of Pests and Plant Diseases 1970–1980. London: Academic Press Inc.; 1981. pp. 35–45.

    Google Scholar 

  10. Meier-Kolthoff JP, Auch AF, Klenk HP, Göker M. Genome sequence-based species delimitation with confidence intervals and improved distance functions. BMC Bioinformatics. 2013;14(1):1–4.

    Article  Google Scholar 

  11. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Bio. 1990;215(3):403–10.

    Article  CAS  Google Scholar 

  12. Crickmore N, Berry C, Panneerselvam S, Mishra R, Connor TR, Bonning BC. Bacterial Pesticidal Protein Resource Center, viewed March 2022, Home - Bpprc.

  13. Crickmore N, Berry C, Panneerselvam S, Mishra R, Connor TR, Bonning BC. A structure-based nomenclature for Bacillus thuringiensis and other bacteria-derived pesticidal proteins. J Invertebr Pathol. 2021;186:107438.

    Article  CAS  PubMed  Google Scholar 

  14. Day M, Ibrahim M, Dyer D, Bulla L Jr. Genome sequence of Bacillus thuringiensis subsp. kurstaki strain HD-1. Genome Announc. 2014;2(4):e00613–14.

    Article  PubMed  PubMed Central  Google Scholar 

  15. Liu G, Song L, Shu C, Wang P, Deng C, Peng Q, Lereclus D, Wang X, Huang D, Zhang J, Song F. Complete genome sequence of Bacillus thuringiensis subsp. kurstaki strain HD73. Genome Announc. 2013;1(2):e00080–13.

    Article  PubMed  PubMed Central  Google Scholar 

  16. Joung KB, Côté JC. Phylogenetic analysis of Bacillus thuringiensis serovars based on 16S rRNA gene restriction fragment length polymorphisms. J Appl Microbiol. 2001;90(1):115–22.

    Article  CAS  PubMed  Google Scholar 

  17. Zhu Y, Shang H, Zhu Q, Ji F, Wang P, Fu J, Deng Y, Xu C, Ye W, Zheng J, Zhu L. Complete genome sequence of Bacillus thuringiensis serovar finitimus strain YBT-020. J Bacteriol. 2011;193(9):2379–80.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  18. Liu Y, Du J, Lai Q, Zeng R, Ye D, Xu J, Shao Z. Proposal of nine novel species of the Bacillus cereus group. Int J Syst Evol Microbiol. 2017;67(8):2499–508.

    Article  CAS  PubMed  Google Scholar 

Download references


Ashaki Mitchell and Daniel Kuhar cultured bacteria and isolated genomic DNA for sequencing. The use or mention of a trademark or proprietary product does not constitute an endorsement, guarantee, or warranty of the product and does not imply its approval to the exclusion of other suitable products by the U.S. Department of Agriculture, an equal opportunity employer.


This project was supported by the Citrus Diseases Research and Extension grants program, Award number 2017-70016-26755 and by the Emergency Citrus Diseases Research and Extension grants program, Award number 2020-70029-33177 from the USDA National Institute of Food and Agriculture.

Author information

Authors and Affiliations



M.B. conceptualized the study, wrote the draft version of the manuscript, and performed sequence typing, serovar prediction, and pesticidal protein identification. M.S. performed genome assemblies, gene/protein prediction, and wrote the draft version of the manuscript. R.M. prepared genomic DNA for sequencing and identified pesticidal proteins. B.B. conceptualized the study, and critically reviewed and edited the manuscript.

Corresponding author

Correspondence to Michael B. Blackburn.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.


Additional file 1.pdf. Biochemical phenotype, parasporal crystal morphology, and geographic origin of isolates used in this study. Biochemical traits, crystal morphologies, and geographic sources of isolates.


Additional file 2.pdf. Isolate-specific read sequencing volumes achieved for PacBio Sequel II and Illumina MiSeq instruments. Sequencing volumes achieved for each isolate and NCBI SRA accession identifiers.


Additional file 3.pdf. Errors observed in pbcromwell-based assemblies for IBL03111 and IBL02897. Describes assembly errors that prompted correcting pbcromwell-only assembly results with Canu-generated results.


Additional file 4.txt. Contigs culled from Canu-based IBL03111 assembly. Contigs from IBL03111 lacking similarity to IBL01313 and that may have been assembly artifacts.


Additional file 5.pdf. Contig accession numbers, sizes, and number of predicted proteins. Provides disaggregated, contig-specific assembly and gene finding results.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Blackburn, M.B., Sparks, M.E., Mishra, R. et al. Genomic sequencing of fourteen Bacillus thuringiensis isolates: insights into geographic variation and phylogenetic implications. BMC Res Notes 16, 134 (2023).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: