Figure 12
From: Rapid phylogenetic and functional classification of short genomic fragments with signature peptides

The signature production process. Approximately 400 million overlapping 10-mers from the 403 bacterial reference genomic sequences are enumerated and collated into a genomic k-mer index. The 5% of this list that appear in multiple genera of bacterial reference genomes are collected, each together with the list of leaves (taxa) that contain it. Using our inferred phylogeny of reference genomes (provided as Additional file 2, Additional file 3), we use the least common ancestor algorithm to assign each signature to the most specific node that covers all observations of the 10-mer.
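
The process described in the caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the child-to-parent dictionary used to represent the phylogeny, the `genus_of` lookup, and the toy sequences and taxa are all hypothetical and chosen only to make the example self-contained.

```python
from collections import defaultdict

K = 10  # k-mer length stated in the caption


def build_kmer_index(proteomes, k=K):
    """Map every overlapping k-mer to the set of leaves (taxa) containing it."""
    index = defaultdict(set)
    for leaf, sequences in proteomes.items():
        for seq in sequences:
            for i in range(len(seq) - k + 1):
                index[seq[i:i + k]].add(leaf)
    return index


def path_to_root(node, parent):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def least_common_ancestor(leaves, parent):
    """Return the most specific node whose subtree covers all given leaves."""
    leaves = list(leaves)
    # Ancestors of the first leaf, ordered from most to least specific.
    candidates = path_to_root(leaves[0], parent)
    for leaf in leaves[1:]:
        ancestors = set(path_to_root(leaf, parent))
        candidates = [node for node in candidates if node in ancestors]
    return candidates[0]


def assign_signatures(index, parent, genus_of):
    """Keep k-mers seen in more than one genus and assign each to its LCA."""
    signatures = {}
    for kmer, leaves in index.items():
        if len({genus_of[leaf] for leaf in leaves}) > 1:
            signatures[kmer] = least_common_ancestor(leaves, parent)
    return signatures


# Toy data (hypothetical): three leaves, a child -> parent tree, two shared 10-mers.
parent = {
    "E_coli_K12": "Escherichia",
    "S_enterica": "Salmonella",
    "B_subtilis": "Bacillus",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella": "Enterobacteriaceae",
    "Enterobacteriaceae": "root",
    "Bacillus": "root",
}
genus_of = {
    "E_coli_K12": "Escherichia",
    "S_enterica": "Salmonella",
    "B_subtilis": "Bacillus",
}
proteomes = {
    "E_coli_K12": ["MKTAYIAKQR", "GSHMLEDPVA"],
    "S_enterica": ["MKTAYIAKQR"],
    "B_subtilis": ["GSHMLEDPVA"],
}

signatures = assign_signatures(build_kmer_index(proteomes), parent, genus_of)
```

In this toy tree, a 10-mer shared by the two enterobacterial leaves is assigned to their family-level node, while one shared with the Bacillus leaf can only be placed at the root. The path-intersection approach is the simplest way to compute a common ancestor; at the scale quoted in the caption, a precomputed ancestor structure would likely be preferable.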