Table 2 Composition of the Illumina in silico dataset generated with FASTQSim for the DTRA metagenomic algorithm challenge

From: FASTQSim: platform-independent data characterization and in silico read generation for NGS datasets

| Organism taxonomy | Number of reads | Number of genes |
| --- | --- | --- |
| Bacteria, Proteobacteria, Gammaproteobacteria, Thiotrichales, Francisellaceae, Francisella tularensis. [GenBank:NC_008245.1] | 206 | 163 |
| Bacteria, Proteobacteria, Alphaproteobacteria, Rhizobiales, Methylobacteriaceae, Methylobacterium radiotolerans JCM 2831. [GenBank:CP001001.1] | 148 | 110 |
| Bacteria, Proteobacteria, Gammaproteobacteria, Pseudomonadales, Pseudomonadaceae, Pseudomonas aeruginosa PAO1. [GenBank:NC_002516.2] | 201 | 101 |
| Bacteria, Actinobacteria, Actinobacteridae, Actinomycetales, Corynebacterineae, Mycobacteriaceae, Mycobacterium avium complex (MAC). [GenBank:EU854994.1] | 200 | 111 |
| Bacteria, Firmicutes, Bacilli, Lactobacillales, Streptococcaceae, Streptococcus pneumoniae ATCC 700669. [GenBank:NC_011900.1] | 201 | 119 |
| Bacteria, Proteobacteria, Gammaproteobacteria, Legionellales, Legionellaceae, Legionella pneumophila Philadelphia 1. [GenBank:NC_002942.5] | 50 | 37 |
| Human immunodeficiency virus 1. [GenBank:NC_001802.1] | 5 | 4 |
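
The read counts in Table 2 can be converted into each organism's share of the simulated dataset, which is a common first check when comparing a classifier's output against a known composition. The following is a minimal sketch, not part of FASTQSim: the dictionary layout, the shortened organism labels, and the helper name `read_fractions` are assumptions made for illustration, while the numbers are transcribed from the table.

```python
# Read and gene counts transcribed from Table 2 as (reads, genes).
# The shortened organism labels are chosen for this sketch only.
composition = {
    "Francisella tularensis": (206, 163),
    "Methylobacterium radiotolerans JCM 2831": (148, 110),
    "Pseudomonas aeruginosa PAO1": (201, 101),
    "Mycobacterium avium complex": (200, 111),
    "Streptococcus pneumoniae ATCC 700669": (201, 119),
    "Legionella pneumophila Philadelphia 1": (50, 37),
    "Human immunodeficiency virus 1": (5, 4),
}


def read_fractions(table):
    """Return each organism's fraction of the total read count (hypothetical helper)."""
    total = sum(reads for reads, _ in table.values())
    return {name: reads / total for name, (reads, _) in table.items()}


total_reads = sum(reads for reads, _ in composition.values())
total_genes = sum(genes for _, genes in composition.values())
fractions = read_fractions(composition)

# Print organisms from the largest to the smallest share of reads.
for name, share in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<42} {share:6.1%}")
print(f"Total reads: {total_reads}, total genes: {total_genes}")
```

Keeping reads and genes together in one tuple per organism means the same dictionary can be reused to compare either column against the counts reported by a metagenomic classifier.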