Table 1 Indexes used and compared in this paper as a distance/similarity measure

From: Absent words and the (dis)similarity analysis of DNA sequences: an experimental study

Index | Comment
Length-weighted index (LWI) | Considered in [1] for the symmetric difference only. Here we also use it for the set intersection.
Jaccard distance | Used in this paper.
Total variation distance (TVD) | Used in [2] to analyze similarity on four human genome assemblies.
GC content | Used in [2] to analyze similarity on four human genome assemblies. Here we use GC content on the symmetric difference and the set intersection of MAW sets, as well as on RAW sets.
Relative absent word (RAW) | Considered in [20] to study Ebola virus genomes against human DNA. Here we use RAW sets for the LWI and GC content measures.
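
As a rough illustration of two of the indexes listed above, the following sketch computes the Jaccard distance between two sets of absent words and the GC content of their symmetric difference and intersection. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names, the toy word sets `maw_x` and `maw_y`, and the choice to pool all characters of a set when computing GC content are illustrative assumptions.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def gc_content(words):
    """Fraction of G and C characters over all characters in the words.

    Pooling the characters of every word in the set is an assumption
    made for this sketch; other aggregations are possible.
    """
    total = sum(len(w) for w in words)
    if total == 0:
        return 0.0
    gc = sum(w.count("G") + w.count("C") for w in words)
    return gc / total


# Hypothetical toy sets standing in for the MAW sets of two sequences.
maw_x = {"AAC", "CGT", "GGA"}
maw_y = {"AAC", "TTT", "GGA", "CCG"}

distance = jaccard_distance(maw_x, maw_y)
gc_symmetric_difference = gc_content(maw_x ^ maw_y)
gc_intersection = gc_content(maw_x & maw_y)
```

With these toy sets, the intersection holds two words and the union five, so the Jaccard distance is 1 - 2/5. The same set operators (`^` for symmetric difference, `&` for intersection) select the words on which a set-based measure such as GC content is then evaluated.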