Table 4 Skewness in the data distribution inflates the overall false alarm rate in the presence of true outliers, but to a different degree depending on whether the IQR, MAD, or FAST-MCD approach is used

From: Input data quality control for NDNQI national comparative statistics and quarterly reports: a contrast of three robust scale estimators for multiple outlier detection

Column groups: Asymmetry in data distribution (Preset skewness); Data composition (Planted outliers, Simulated observations); True and false outlier rates by different approach (IQR, MAD, FAST-MCD).

| Preset skewness | Planted outliers | Simulated observations | IQR | MAD | FAST-MCD |
|---|---|---|---|---|---|
| 0.000 | 10 | | 1.000(0.000) | 1.000(0.000) | 1.000(0.000) |
| 1.000 | | 990 | 0.028(0.006) | 0.029(0.006) | 0.076(0.013) |
| Overall Outliers Detected / (10 + 990) | | | 0.038 | 0.039 | 0.085 |
| 0.000 | 10 | | 1.000(0.000) | 1.000(0.000) | 1.000(0.000) |
| 2.000 | | 990 | 0.053(0.006) | 0.066(0.008) | 0.223(0.014) |
| Overall Outliers Detected / (10 + 990) | | | 0.062 | 0.075 | 0.231 |
| 0.000 | 10 | | 1.000(0.000) | 1.000(0.000) | 1.000(0.000) |
| 3.000 | | 990 | 0.081(0.006) | 0.123(0.008) | 0.344(0.012) |
| Overall Outliers Detected / (10 + 990) | | | 0.090 | 0.131 | 0.351 |
  1. Planted outliers were drawn from a normal distribution, and simulated observations from a Gamma distribution.
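
The "Overall Outliers Detected / (10 + 990)" rows can be read as a weighted average of the two rates above them: the detection rate among the 10 planted outliers and the false alarm rate among the 990 simulated observations. The following is a minimal sketch of that arithmetic, using only the rounded rates printed in the table. The function and variable names (`overall_rate`, `false_alarm`) are illustrative and do not come from the article.

```python
# Sample composition taken from the table: 10 planted outliers, 990 simulated observations.
N_PLANTED, N_SIMULATED = 10, 990

# Detection rate among planted outliers is 1.000 for every method and skewness level.
TRUE_DETECTION = 1.000

# False alarm rates among the simulated observations, keyed by preset skewness.
# Values are the rounded figures printed in the table.
false_alarm = {
    1.0: {"IQR": 0.028, "MAD": 0.029, "FAST-MCD": 0.076},
    2.0: {"IQR": 0.053, "MAD": 0.066, "FAST-MCD": 0.223},
    3.0: {"IQR": 0.081, "MAD": 0.123, "FAST-MCD": 0.344},
}


def overall_rate(true_rate, false_rate, n_planted=N_PLANTED, n_simulated=N_SIMULATED):
    """Share of all observations flagged as outliers (true detections plus false alarms)."""
    flagged = true_rate * n_planted + false_rate * n_simulated
    return flagged / (n_planted + n_simulated)


# Recompute the "Overall" rows for every skewness level and method.
overall = {
    skew: {method: overall_rate(TRUE_DETECTION, rate) for method, rate in rates.items()}
    for skew, rates in false_alarm.items()
}

for skew, rates in overall.items():
    row = ", ".join(f"{method}={value:.3f}" for method, value in rates.items())
    print(f"skewness {skew}: {row}")
```

Recomputed this way, the overall rates agree with the table to within 0.001. The one visible difference is MAD at skewness 3.000 (0.132 recomputed against 0.131 in the table), presumably because the inputs used here are themselves rounded to three decimals.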