From: Absent words and the (dis)similarity analysis of DNA sequences: an experimental study
Index | Comment |
---|---|
Length-weighted index (LWI) | Considered in [1] only for the symmetric difference. Here we also use it for the set intersection |
Jaccard distance | Used in this paper |
Total variation distance (TVD) | Used in [2] to analyze similarity on four human genome assemblies |
GC content | Used in [2] to analyze similarity on four human genome assemblies. Here we apply GC content to the symmetric difference and the set intersection of MAW sets, as well as to RAW sets |
Relative absent word (RAW) | Considered in [20] to study Ebola virus genomes against human DNA. Here we use RAW sets for LWI and GC content measures |
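
As an illustration of how such set-based indices can be evaluated, the following is a minimal Python sketch, not the implementation used in the paper. It assumes each sequence's absent words are already available as a set of strings. The Jaccard distance is the standard 1 - |A ∩ B| / |A ∪ B|. The GC content is assumed here to be the fraction of G and C characters over all characters in a word set, and the LWI is assumed to sum 1/|w|^2 over the words of a set; both formulas, the function names, and the toy word sets are illustrative assumptions rather than definitions taken from this table.

```python
def jaccard_distance(a, b):
    """Standard Jaccard distance between two word sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets are identical
    return 1.0 - len(a & b) / len(union)


def gc_content(words):
    """Fraction of G/C characters over all characters in a word set (assumed definition)."""
    total = sum(len(w) for w in words)
    if total == 0:
        return 0.0
    gc = sum(w.count("G") + w.count("C") for w in words)
    return gc / total


def length_weighted_index(words):
    """Sum of 1/|w|^2 over a word set (assumed definition); words must be non-empty."""
    return sum(1.0 / len(w) ** 2 for w in words)


# Toy absent-word sets, invented purely for illustration.
maws_x = {"AA", "CG", "GTT"}
maws_y = {"AA", "CC", "GTT", "TAC"}

sym_diff = maws_x ^ maws_y   # words absent from exactly one sequence
common = maws_x & maws_y     # words absent from both sequences

print(jaccard_distance(maws_x, maws_y))
print(gc_content(sym_diff), gc_content(common))
print(length_weighted_index(sym_diff), length_weighted_index(common))
```

Passing the symmetric difference or the intersection to the same helper mirrors the way the table applies one measure to several derived sets. A RAW set could be passed in the same way, provided it is represented as a set of strings.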