Table 1 Performance indicators of different classifiers on the breast cancer diagnostic and cardiovascular disease datasets

From: Disease prediction via Bayesian hyperparameter optimization and ensemble learning

| Classifier / Indicator | Accuracy (%) | Precision (%) | Recall (%) | F1 score (%) | AUC | KS value |
|---|---|---|---|---|---|---|
| XGBoost_BC | 94.74 (94.40, 1.65) | 92.19 (93.10, 3.25) | 93.65 (91.83, 3.68) | 92.91 (92.38, 2.29) | 0.9857 (0.9845, 0.76) | 0.9061 |
| LightGBM_BC | 94.74 (94.05, 1.69) | 92.19 (92.65, 3.49) | 93.65 (91.26, 3.61) | 92.91 (92.00, 2.33) | 0.9821 (0.9835, 0.80) | 0.9087 |
| GBDT_BC | 94.15 (94.72, 1.59) | 90.77 (94.41, 3.24) | 93.65 (92.79, 2.21) | 92.19 (92.96, 2.20) | 0.9856 (0.9869, 0.66) | 0.8968 |
| LR_BC | 92.40 (93.64, 1.62) | 89.06 (92.77, 3.15) | 90.48 (90.09, 3.42) | 89.76 (91.25, 2.19) | 0.9825 (0.9847, 0.58) | 0.8796 |
| RF_BC | 92.40 (91.81, 1.94) | 90.32 (90.24, 3.64) | 88.89 (87.77, 4.67) | 89.60 (88.93, 2.78) | 0.9710 (0.9757, 0.94) | 0.8690 |
| BPNN_BC | 89.47 (92.86, 1.85) | 89.47 (91.64, 4.18) | 80.95 (88.80, 3.67) | 85.00 (90.23, 2.56) | 0.9669 (0.9778, 0.92) | 0.8439 |
| DT_BC | 87.72 (90.75, 1.83) | 90.38 (86.76, 3.84) | 74.60 (88.53, 4.92) | 81.74 (87.41, 2.80) | 0.9314 (0.9500, 1.53) | 0.6997 |
| XGBoost_CVD | 73.50 (73.51, 0.27) | 75.80 (75.55, 0.49) | 69.54 (69.53, 0.51) | 72.54 (72.44, 0.30) | 0.8044 (0.8023, 0.26) | 0.4733 |
| LightGBM_CVD | 73.53 (73.56, 0.26) | 75.38 (75.82, 0.47) | 70.40 (69.17, 0.60) | 72.81 (72.32, 0.32) | 0.8042 (0.8023, 0.26) | 0.4762 |
| GBDT_CVD | 73.56 (73.51, 0.27) | 75.70 (75.60, 0.49) | 69.90 (69.43, 0.58) | 72.68 (72.38, 0.33) | 0.8041 (0.8023, 0.25) | 0.4746 |
| LR_CVD | 72.32 (71.92, 0.41) | 74.90 (74.02, 0.72) | 67.69 (67.50, 0.54) | 71.11 (70.62, 0.40) | 0.7869 (0.7829, 0.38) | 0.4503 |
| RF_CVD | 73.55 (73.51, 0.27) | 75.98 (76.02, 0.72) | 69.39 (68.70, 0.60) | 72.54 (72.17, 0.32) | 0.8026 (0.8012, 0.26) | 0.4717 |
| BPNN_CVD | 72.85 (72.81, 0.31) | 73.07 (73.73, 1.11) | 72.96 (70.98, 1.79) | 73.01 (72.28, 0.53) | 0.7945 (0.7917, 0.28) | 0.4686 |
| DT_CVD | 73.26 (73.12, 0.17) | 76.45 (76.79, 0.86) | 67.72 (66.42, 1.33) | 71.83 (71.22, 0.48) | 0.7954 (0.7942, 0.30) | 0.4667 |

  1. (a) Values in parentheses are the mean and standard deviation of each performance indicator over repeated random subsamples. For the BC dataset, 300 of the 569 samples were randomly selected in each of 1000 repetitions. For the CVD dataset, 1000 of the 65,535 samples were randomly selected in each of 1000 repetitions.
  2. (b) Italic numbers indicate the optimal values.
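
The repeated subsampling described in note (a) can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes that each repetition draws samples without replacement, that a fixed set of predictions is re-scored on each draw, and that the reported spread is the sample standard deviation; the source confirms none of these details. The function names (`accuracy`, `f1_score`, `subsample_summary`) and the synthetic labels and predictions are hypothetical stand-ins.

```python
import random
import statistics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def subsample_summary(y_true, y_pred, metric, n_draw, n_repeats, seed=0):
    """Mean and sample standard deviation of `metric` over random subsamples."""
    rng = random.Random(seed)
    indices = range(len(y_true))
    scores = []
    for _ in range(n_repeats):
        # Assumption: each draw is taken without replacement.
        chosen = rng.sample(indices, n_draw)
        scores.append(metric([y_true[i] for i in chosen],
                             [y_pred[i] for i in chosen]))
    return statistics.mean(scores), statistics.stdev(scores)


# Synthetic stand-in data sized like the BC dataset (569 samples).
data_rng = random.Random(42)
labels = [data_rng.randint(0, 1) for _ in range(569)]
# Hypothetical classifier that agrees with the label about 90% of the time.
preds = [y if data_rng.random() < 0.9 else 1 - y for y in labels]

# 300 samples per draw, 1000 repetitions, as in note (a) for the BC dataset.
mean_acc, sd_acc = subsample_summary(labels, preds, accuracy, 300, 1000)
mean_f1, sd_f1 = subsample_summary(labels, preds, f1_score, 300, 1000)
print(f"accuracy: {100 * mean_acc:.2f}, {100 * sd_acc:.2f}")
print(f"F1 score: {100 * mean_f1:.2f}, {100 * sd_f1:.2f}")
```

Fixing the seed makes the summary reproducible. For the CVD setting in note (a), the same function would be called with `n_draw=1000` on a 65,535-sample dataset.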