Sensitivity and specificity of simulated reads from draft soil genomes. Simulated reads were constructed using MetaSim from the genomes of four soil bacteria. Herbaspirillum seropedicae and Bacillus mojavensis are species from genera represented in the BLAST databases NR and NT as well as in our signature peptide database (SP). Microbacterium trichothecenolyticum represents a genus found in NR and NT but not in SP. Bosea thiooxidans is from a genus not found in any of the three. (a) Specificity of placement of simulated 300-bp reads on the reference tree using our method. (b) Specificity of placement of simulated 75-bp reads using our method. (c) Comparison of the sensitivity of our method (top right panel) with that of MEGAN using three BLAST program and database combinations: BLASTX against NR (top left), BLASTN against NT (bottom left), and BLASTX against the same genomes used in SP (bottom right). Simulated read lengths of 75, 150, 300, and 600 bp were used for each of the four genomes in each of the four panels. Colors indicate specificity of placement, with gold indicating non-specific placement near the root node in each case. (d) Details of specificity of placement of simulated 150-bp Herbaspirillum seropedicae reads for the four methods: our method (black), MEGAN4 with BLASTX against NR (red), MEGAN4 with BLASTN against NT (green), and MEGAN4 with BLASTX against the same genomes used in SP (cyan).