
Table 1 Indexes used and compared in this paper as distance/similarity measures

From: Absent words and the (dis)similarity analysis of DNA sequences: an experimental study

| Index | Comment |
| --- | --- |
| Length-weighted index (LWI) | Considered in [1] only for the symmetric difference. Here we also use it for the set intersection |
| Jaccard distance | Used in this paper |
| Total variation distance (TVD) | Used in [2] to analyze similarity on four human genome assemblies |
| GC content | Used in [2] to analyze similarity on four human genome assemblies. Here we use GC content on the symmetric difference and the set intersection of MAW sets, as well as on RAW sets |
| Relative absent word (RAW) | Considered in [20] to study Ebola virus genomes against human DNA. Here we use RAW sets for the LWI and GC content measures |
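
As a rough illustration of how two of the indexes in Table 1 operate on sets of absent words, the sketch below computes the standard Jaccard distance between two word sets and a GC content value over their symmetric difference and intersection. The function names, the toy word sets, and the choice to pool all characters of a set when computing GC content are assumptions made for this example; the table itself does not specify the paper's exact formulas.

```python
from typing import AbstractSet, Iterable


def jaccard_distance(a: AbstractSet[str], b: AbstractSet[str]) -> float:
    """Standard Jaccard distance: 1 - |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        # Two empty sets are treated as identical (an assumption).
        return 0.0
    return 1.0 - len(a & b) / len(union)


def gc_content(words: Iterable[str]) -> float:
    """Fraction of G and C characters, pooled over all words (an assumption)."""
    words = list(words)
    total = sum(len(w) for w in words)
    if total == 0:
        return 0.0
    gc = sum(w.count("G") + w.count("C") for w in words)
    return gc / total


# Hypothetical absent-word sets for two sequences x and y (toy data).
maws_x = {"AA", "CGT", "GGC"}
maws_y = {"AA", "CGT", "TTA", "GCG"}

distance = jaccard_distance(maws_x, maws_y)
gc_sym_diff = gc_content(maws_x ^ maws_y)      # symmetric difference
gc_intersection = gc_content(maws_x & maws_y)  # set intersection

print(f"Jaccard distance: {distance:.3f}")
print(f"GC content (symmetric difference): {gc_sym_diff:.3f}")
print(f"GC content (intersection): {gc_intersection:.3f}")
```

The same set operations (`^` for symmetric difference, `&` for intersection) would apply to RAW sets if those are represented as Python sets of strings.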