
Draft genome sequence of “Candidatus Afipia apatlaquensis” sp. nov., IBT-C3, a potential strain for decolorization of textile dyes



To characterize a river-associated, enriched microbiome capable of degrading an anthraquinone dye from the oil blue family, and to assess its functional potential, we performed a taxa-specific metagenomic deconvolution analysis based on contact probability maps at the chromosomal level. This study allows the genomic content of “Candidatus Afipia apatlaquensis” strain IBT-C3 to be associated with its phenotypic potential in the context of bioremediation of textile dyes. We anticipate that this resource will be very useful in comparative genomic clinical studies, contributing to understanding the genomic basis of Afipia pathogenicity.

Data description

Here, we report the first draft genome sequence of “Candidatus Afipia apatlaquensis” sp. nov., strain IBT-C3, obtained by deconvolution of a textile-dye degrader microbiome in Mexico. The genome composite was deconvoluted using a Hi-C proximity ligation method. Whole-genome-based comparisons and phylogenomic reconstruction indicate that strain IBT-C3 represents a new species of the genus Afipia. The assembly is 5,604,749 bp long, with a completeness of 92.5% and a G+C content of 60.72%. The genome complement of IBT-C3 suggests a functional potential for decolorization of textile dyes, in contrast with previous reports on the genus Afipia, which focused on its pathogenic potential.


Afipia is a bacterial genus clustered in the Bradyrhizobiaceae family of the Proteobacteria phylum. Afipia species are widely known as human and animal pathogens and have been isolated from human sources or hospital water supplies. They are also considered amoeba-resisting bacteria, as they can be recovered from amoebal coculture in domestic water systems, a trait related to their capacity for causing nosocomial infections. Five species of the genus have standing in nomenclature, and other genospecies groups have been described in the literature [1,2,3,4]. At the time of writing, GenBank contains 27 assemblies of this genus, five of which are derived from type material. However, most of this taxon has been explored in a clinical context. In this study, we have deconvoluted a genome composite of a new species of Afipia from a mixed culture enriched with an anthraquinonic textile dye. This project aims to explore how nutritional selection by textile dyes influences microbial communities in bodies of water in the state of Morelos, Mexico. Based on genomic relationship criteria, we have named this novel taxon “Candidatus Afipia apatlaquensis” sp. nov., in line with its coherence within the genus Afipia and its clear distinctiveness from the species already described. Analysis of the annotated genome suggests plausible molecular functions related to textile dye degradation. This is the first report of an Afipia genetic resource assembled from an enriched river sediment biome in a textile dye bioremediation context. Our long-term goal is to reproduce patterns of microbial dynamics that shed light on how microorganisms respond to pollution generated by the textile industry. We anticipate that this resource will be very useful in comparative genomic studies, contributing to the understanding of the genomic bases that modulate environmental or pathogenic behaviors in Afipia.

Data description

To explore the microbial diversity present in a highly polluted area of the Apatlaco river basin located in Morelos, México, four samples of sediments and surface water were taken (sites P1: −99.26872, 18.97372; P7: −99.2187, 18.83; P10: −99.23337, 18.78971; and P17: −99.18278, 18.60914) and processed as described by Bretón-Deval et al. [5]. One composite sample was enriched in the laboratory with 200 mg mL−1 of an anthraquinone dye from the oil blue family (Deep-Blue 35™, obtained in its national commercial form from Monroe Chemical Company de México, S.A. de C.V.). The enriched sample was incubated for 30 days at room temperature in a 10 L polyethylene batch reactor. Ten grams of the sedimented sludge in the reactor were extracted and directly crosslinked according to [6]. The sample was sent to Phase Genomics, Inc. (Seattle, USA) for massively parallel sequencing, proximity ligation (Hi-C) and deconvolution services. Sequencing of the DNA libraries yielded 11.7 Gb of paired-end reads (Data file 6) [7]. Postprocessing of the Hi-C short reads involved trimming with Trimmomatic v0.39 [8], in which 141,561,527 of the 144,014,062 input reads survived (98.30%), followed by clustering of the reads over a previous draft assembly with ProxiMeta software [6]. CheckM v1.0.11 was used to assess genome quality statistics [9]. The “Candidatus Afipia apatlaquensis” sp. nov. genome composite was submitted to GenBank under BioProject PRJNA606950 (Data file 1) [10] and was annotated with the National Center for Biotechnology Information (NCBI) Prokaryotic Genome Annotation Pipeline [11]. In addition, the metagenome-assembled genome was annotated with the KofamKOALA tool [12] in order to assign KEGG Orthologs (KO) related to decolorization of textile dyes (Data file 3) [13].
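The read-survival figure quoted for the trimming step can be checked with simple arithmetic. The following is a minimal sketch that uses only the two read counts reported above; the function and variable names are illustrative and are not part of Trimmomatic or any other tool named in this note.

```python
# Read counts reported in the text for the Trimmomatic trimming step.
input_reads = 144_014_062
surviving_reads = 141_561_527


def survival_percent(total: int, surviving: int) -> float:
    """Return the percentage of reads that survive trimming."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * surviving / total


# Reads discarded by trimming and the surviving fraction.
dropped_reads = input_reads - surviving_reads
pct = survival_percent(input_reads, surviving_reads)
print(f"surviving: {pct:.2f}%  dropped: {dropped_reads:,}")
```

Rounded to two decimals, the result agrees with the 98.30% reported in the text.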
Genomic taxonomy was assessed by analysis of overall genome relatedness indices, namely Average Nucleotide Identity and Mash distance (Data files 2 and 4) [14, 15], and by phylogenomic reconstruction with the Up-to-date Bacterial Core Gene set (UBCG) tools (Data file 5) [16]. Table 1 presents data repositories and links for genome assembly and annotations, taxonomic descriptions and whole-genome sequence analysis.
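To illustrate what a Mash distance expresses, the sketch below computes it for two toy sequences. It is a simplification under stated assumptions: it uses the exact Jaccard index over all k-mers rather than the MinHash sketch that Mash itself estimates it from, it ignores reverse complements, and the sequences, the k value and all function names are hypothetical examples rather than data or settings from this study. The distance formula D = −(1/k) ln(2j / (1 + j)) is the one published for Mash.

```python
import math


def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard index of two k-mer sets (Mash estimates this by MinHash)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(j, k):
    """Mash distance for Jaccard index j and k-mer size k."""
    if j <= 0.0:
        return 1.0  # no shared k-mers: maximal distance
    if j >= 1.0:
        return 0.0  # identical k-mer content
    return -math.log(2.0 * j / (1.0 + j)) / k


# Hypothetical toy sequences; seq_b differs from seq_a by one substitution.
k = 5
seq_a = "ACGTACGTGGCTAGCTAGGATCCA"
seq_b = "ACGTACGTGGCTTGCTAGGATCCA"

d_same = mash_distance(jaccard(kmers(seq_a, k), kmers(seq_a, k)), k)
d_diff = mash_distance(jaccard(kmers(seq_a, k), kmers(seq_b, k)), k)
print(f"identical: {d_same:.4f}  one substitution: {d_diff:.4f}")
```

Real genome comparisons use much larger k and sketch sizes; the toy values here only show that the distance is zero for identical k-mer content and grows as shared k-mers are lost.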

Table 1 Overview of data files


The reported “Candidatus Afipia apatlaquensis” sp. nov. genome composite was assembled from a mixed sample. However, to reduce bias, we applied a novel methodology that involves covalent association of nearby sequences intrachromosomally, which ensures that sequences belonging to the same cell can be grouped by a physical signal.
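The idea of grouping sequences by a physical contact signal can be sketched with a toy example. This is only an illustration under loud assumptions: it is not the ProxiMeta algorithm, and the contig names, link counts, threshold and function name are all hypothetical. Contigs are treated as graph nodes, Hi-C read pairs linking two contigs as weighted edges, and contigs joined by enough links are merged into the same group with a union-find structure.

```python
from collections import defaultdict


def cluster_contigs(contigs, links, min_links=5):
    """Group contigs connected by at least min_links Hi-C read pairs.

    Toy single-linkage clustering (connected components); real
    deconvolution tools use normalized contact scores and more
    elaborate clustering.
    """
    parent = {c: c for c in contigs}

    def find(x):
        # Follow parent pointers to the group root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), n in links.items():
        if n >= min_links:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(set)
    for c in contigs:
        groups[find(c)].add(c)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical contigs and Hi-C link counts between contig pairs.
contigs = ["c1", "c2", "c3", "c4", "c5"]
links = {("c1", "c2"): 40, ("c2", "c3"): 25, ("c4", "c5"): 30, ("c3", "c4"): 2}

bins = cluster_contigs(contigs, links)
print(bins)
```

In this toy input, the weak c3 to c4 signal (2 links) falls below the threshold, so the contigs separate into two groups that stand in for two different cells of origin.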

Availability of data and materials

The data described in this Data note can be freely and openly accessed at GenBank under BioProject PRJNA606950 [10]. Data files 2 to 5 are freely accessible on Figshare. Please see Table 1 and references [7, 10, 13,14,15,16] for details and links to the data.



EC: Enzyme commission number

GC: Guanine–Cytosine content

Hi-C: A methodology for studying the three-dimensional structure of chromosomes

k-mers: Nucleotide subsequences of length k

KO: KEGG orthology identifier


  1. Brenner DJ, Hollis DG, Moss CW, English CK, Hall GS, Vincent J, et al. Proposal of Afipia gen. nov., with Afipia felis sp. nov. (formerly the cat scratch disease bacillus), Afipia clevelandensis sp. nov. (formerly the Cleveland Clinic Foundation strain), Afipia broomeae sp. nov., and three unnamed genospecies. J Clin Microbiol. 1991.

  2. La Scola B, Barrassi L, Raoult D. Isolation of new fastidious α Proteobacteria and Afipia felis from hospital water supplies by direct plating and amoebal co-culture procedures. FEMS Microbiol Ecol. 2000.

  3. La Scola B, Mallet MN, Grimont PAD, Raoult D. Description of Afipia birgiae sp. nov. and Afipia massiliensis sp. nov. and recognition of Afipia felis genospecies A. Int J Syst Evol Microbiol. 2002.

  4. Thomas V, Herrera-Rimann K, Blanc DS, Greub G. Biodiversity of amoebae and amoeba-resisting bacteria in a hospital water network. Appl Environ Microbiol. 2006.

  5. Breton-Deval L, Sanchez-Flores A, Juárez K, Vera-Estrella R. Integrative study of microbial community dynamics and water quality along The Apatlaco River. Environ Pollut. 2019.

  6. Press MO, Wiser AH, Kronenberg ZN, Langford KW, Shakya M, Lo C-C, et al. Hi-C deconvolution of a human gut microbiome yields high-quality draft genomes and reveals plasmid-genome interactions. bioRxiv. 2017.

  7. National Center for Biotechnology Information. US National Library of Medicine, Rockville Pike. 2020. Accessed 15 Apr 2020.

  8. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014.

  9. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from. Genome Res. 2015.

  10. National Center for Biotechnology Information. US National Library of Medicine, Rockville Pike. 2020. Accessed 28 Mar 2020.

  11. Tatusova T, Dicuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, et al. NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res. 2016.

  12. Aramaki T, Blanc-Mathieu R, Endo H, Ohkubo K, Kanehisa M, Goto S, et al. KofamKOALA: KEGG ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics. 2019.

  13. Sánchez-Reyes A, Bretón-Deval L, Mangelson H, Sanchez-Flores A. 2020. Data File 3_Enzymes.pdf. Figshare.

  14. Sánchez-Reyes A, Bretón-Deval L, Mangelson H, Sanchez-Flores A. 2020. Data File 2_Candidatus Afipia apatlaquensis.OGRI.pdf. Figshare.

  15. Sánchez-Reyes A, Bretón-Deval L, Mangelson H, Sanchez-Flores A. 2020. Data File 4_Candidatus Afipia apatlaquensis-Description and Features.pdf. Figshare.

  16. Sánchez-Reyes A, Bretón-Deval L, Mangelson H, Sanchez-Flores A. 2020. Data File 5_Candidatus Afipia apatlaquensis. Phylogenomic analysis.tiff. Figshare.



The authors thank UNAM-IBT for financial support of this study. ASR thanks the CATEDRAS CONACYT program of the Consejo Nacional de Ciencia y Tecnología, Mexico, for supporting Project 237. We would also like to thank the Unidad Universitaria de Secuenciación Masiva y Bioinformática (UUSMB) of the Instituto de Biotecnología, UNAM, where initial raw data were analyzed.


Not applicable

Author information




ASR was involved in the conceptualization of the study, taxonomy analysis, and manuscript writing and editing. LBD collected the samples employed in the study and was involved in the metagenomic analysis. HM prepared the Hi-C library and carried out the deconvolution analysis of the reported genome. ASF performed the draft shotgun assembly and was involved in editing the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Ayixon Sánchez-Reyes.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit the Creative Commons website. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Sánchez-Reyes, A., Bretón-Deval, L., Mangelson, H. et al. Draft genome sequence of “Candidatus Afipia apatlaquensis” sp. nov., IBT-C3, a potential strain for decolorization of textile dyes. BMC Res Notes 13, 265 (2020).



  • “Candidatus Afipia apatlaquensis” sp. nov.
  • Strain IBT-C3
  • Textile dye decolorization
  • Metagenomic deconvolution