Table 1 Dataset summary

From: Fast accurate missing SNP genotype local imputation

Dataset                  human     mouse      hd cattle   ld cattle
Physical length          79 mbp    95 mbp     76 mbp      76 mbp
#samples                 83        15         64          469
#SNPs                    40,755    288,229    22,266      1,508
Missing genotype rate    0.268%    11.1%      2.224%      0.078%
#complete SNPs           34,071    144,820    5,487       1,271
Summary of the four datasets used in the simulation studies. All four datasets are from chromosome 17 of the respective species (human, mouse, and cattle). “Physical length” is the chromosome length in million base pairs (mbp); “#samples” is the number of individuals genotyped; “#SNPs” is the number of SNP markers in the original dataset; and “missing genotype rate” is the percentage of genotype values missing from the original dataset. “#complete SNPs” is the number of SNP markers at which every sample has a genotype value. All other SNP markers were removed, leaving a complete sub-dataset for use in the simulation studies.
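The two derived quantities in the table, the missing genotype rate and the set of complete SNPs, can be illustrated with a minimal sketch. It assumes genotypes are held as a SNP-by-sample matrix in which `None` marks a missing call and 0/1/2 codes a called genotype. This layout, the coding, and the function names `missing_rate` and `complete_snps` are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List, Optional

# Assumed encoding: 0/1/2 for a called genotype, None for a missing call.
Genotype = Optional[int]
# Assumed layout: one row per SNP marker, one column per sample.
GenotypeMatrix = List[List[Genotype]]


def missing_rate(matrix: GenotypeMatrix) -> float:
    """Fraction of genotype values that are missing over the whole matrix."""
    total = sum(len(row) for row in matrix)
    missing = sum(1 for row in matrix for g in row if g is None)
    return missing / total if total else 0.0


def complete_snps(matrix: GenotypeMatrix) -> GenotypeMatrix:
    """Keep only the SNP markers at which every sample has a genotype value."""
    return [row for row in matrix if all(g is not None for g in row)]


# Toy data: 4 SNPs genotyped on 3 samples, with 2 missing calls.
toy: GenotypeMatrix = [
    [0, 1, 2],
    [1, None, 0],
    [2, 2, 1],
    [None, 0, 0],
]

rate = missing_rate(toy)
kept = complete_snps(toy)
print(f"missing genotype rate: {rate:.3%}")
print(f"#complete SNPs: {len(kept)} of {len(toy)}")
```

Multiplying the value returned by `missing_rate` by 100 gives a percentage comparable to the "missing genotype rate" row, and `len(complete_snps(...))` corresponds to the "#complete SNPs" row.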