
Table 5 Comparison of the alignment-free distances and the benchmark MSA distance for 70 Gammaproteobacteria genomes

From: Comparison of next-generation sequencing samples using compression-based distances and its application to phylogenetic reconstruction

  

| Data set | Measure | d_NCD | d | d_CDM | CVTree | d2S | co-phylog |
|---|---|---|---|---|---|---|---|
| 16S rRNA sequences | parsimony score | 17 | 18 | 18 | 17 | 18 | 25 |
| | tree symmetric difference | 50 | 52 | 52 | 50 | 62 | 108 |
| | distance correlation | 0.93 | 0.90 | 0.93 | 0.92 | 0.92 | 0.65 |
| Genome sequences | parsimony score | 22 | 22 | 21 | 21 | 31 | 26 |
| | tree symmetric difference | 80 | 78 | 76 | 84 | 110 | 110 |
| | distance correlation | 0.47 | 0.46 | 0.47 | 0.67 | 0.50 | 0.45 |
| NGS short reads | parsimony score | 21 | 19 | 23 | 24 | 32 | 28 |
| | tree symmetric difference | 90 | 70 | 84 | 88 | 114 | 116 |
| | distance correlation | 0.60 | 0.58 | 0.53 | 0.63 | 0.48 | 0.42 |

  1. The NGS short reads were simulated from the whole genome sequences using the Exact model of MetaSim at 1× sampling depth. The two smallest parsimony scores, the two smallest tree symmetric differences and the two highest correlation coefficients are highlighted in boldface. For CVTree, we used k = 7 for the 16S rRNA data set and k = 12 for the whole genome and NGS data sets. For d2S, we used k = 6 for the 16S rRNA data set and k = 8 for the whole genome and NGS data sets.
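
The highlighting rule in the note can be checked against the table values. The following is a minimal Python sketch, not the authors' code: it takes the NGS short reads rows from the table and reports which methods hold the two best values for each measure. The names `METHODS`, `NGS_ROWS`, `HIGHER_IS_BETTER` and `best_two` are hypothetical, and the tie handling (every method tied at one of the two best distinct values is reported) is an assumption, since the note does not say how ties are treated.

```python
# Column order follows the table header; "d" is kept exactly as it appears there.
METHODS = ["d_NCD", "d", "d_CDM", "CVTree", "d2S", "co-phylog"]

# Values copied from the "NGS short reads" rows of the table.
NGS_ROWS = {
    "parsimony score": [21, 19, 23, 24, 32, 28],
    "tree symmetric difference": [90, 70, 84, 88, 114, 116],
    "distance correlation": [0.60, 0.58, 0.53, 0.63, 0.48, 0.42],
}

# Per the note: smallest is best for the first two measures,
# highest is best for the correlation coefficient.
HIGHER_IS_BETTER = {"distance correlation"}


def best_two(metric, values):
    """Return the methods holding the two best distinct values for a metric.

    Assumption (not stated in the source): all methods tied at one of the
    two best distinct values are reported.
    """
    reverse = metric in HIGHER_IS_BETTER
    best_values = sorted(set(values), reverse=reverse)[:2]
    return [m for m, v in zip(METHODS, values) if v in best_values]


if __name__ == "__main__":
    for metric, values in NGS_ROWS.items():
        print(metric, "->", best_two(metric, values))
```

Under this tie rule, rows with repeated values (for example the 16S rRNA parsimony scores) would report more than two methods, which may differ from the boldface in the published table.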