Disease prediction via Bayesian hyperparameter optimization and ensemble learning

Gao, Liyuan; Ding, Yongmei

doi:10.1186/s13104-020-05050-0

BMC Research Notes

Table 1 Performance indicators of different classifiers on the breast cancer diagnostic and cardiovascular disease datasets


Classifier	Accuracy (%)	Precision (%)	Recall (%)	F1 score (%)	AUC	KS value
XGBoost_BC	94.74 (94.40,1.65)	92.19 (93.10,3.25)	93.65 (91.83,3.68)	92.91 (92.38,2.29)	0.9857 (0.9845,0.76)	0.9061
LightGBM_BC	94.74 (94.05,1.69)	92.19 (92.65,3.49)	93.65 (91.26,3.61)	92.91 (92.00,2.33)	0.9821 (0.9835,0.80)	0.9087
GBDT_BC	94.15 (94.72,1.59)	90.77 (94.41,3.24)	93.65 (92.79,2.21)	92.19 (92.96,2.20)	0.9856 (0.9869,0.66)	0.8968
LR_BC	92.40 (93.64,1.62)	89.06 (92.77,3.15)	90.48 (90.09,3.42)	89.76 (91.25,2.19)	0.9825 (0.9847,0.58)	0.8796
RF_BC	92.40 (91.81,1.94)	90.32 (90.24,3.64)	88.89 (87.77,4.67)	89.60 (88.93,2.78)	0.9710 (0.9757,0.94)	0.8690
BPNN_BC	89.47 (92.86,1.85)	89.47 (91.64,4.18)	80.95 (88.80,3.67)	85.00 (90.23,2.56)	0.9669 (0.9778,0.92)	0.8439
DT_BC	87.72 (90.75,1.83)	90.38 (86.76,3.84)	74.60 (88.53,4.92)	81.74 (87.41,2.80)	0.9314 (0.9500,1.53)	0.6997
XGBoost_CVD	73.50 (73.51,0.27)	75.80 (75.55,0.49)	69.54 (69.53,0.51)	72.54 (72.44,0.30)	0.8044 (0.8023,0.26)	0.4733
LightGBM_CVD	73.53 (73.56,0.26)	75.38 (75.82,0.47)	70.40 (69.17,0.60)	72.81 (72.32,0.32)	0.8042 (0.8023,0.26)	0.4762
GBDT_CVD	73.56 (73.51,0.27)	75.70 (75.60,0.49)	69.90 (69.43,0.58)	72.68 (72.38,0.33)	0.8041 (0.8023,0.25)	0.4746
LR_CVD	72.32 (71.92,0.41)	74.90 (74.02,0.72)	67.69 (67.50,0.54)	71.11 (70.62,0.40)	0.7869 (0.7829,0.38)	0.4503
RF_CVD	73.55 (73.51,0.27)	75.98 (76.02,0.72)	69.39 (68.70,0.60)	72.54 (72.17,0.32)	0.8026 (0.8012,0.26)	0.4717
BPNN_CVD	72.85 (72.81,0.31)	73.07 (73.73,1.11)	72.96 (70.98,1.79)	73.01 (72.28,0.53)	0.7945 (0.7917,0.28)	0.4686
DT_CVD	73.26 (73.12,0.17)	76.45 (76.79,0.86)	67.72 (66.42,1.33)	71.83 (71.22,0.48)	0.7954 (0.7942,0.30)	0.4667

(a) Values in parentheses are the mean and standard deviation of each performance indicator over repeated random subsampling. For the BC dataset, 300 of the 569 samples were randomly selected in each repetition, and this was repeated 1000 times. For the CVD dataset, 1000 of the 65,535 samples were randomly selected in each repetition, and this was repeated 1000 times.
(b) Italic numbers indicate optimal values.
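
The repeated-subsampling summary described in note (a) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' code: the helper names `subsample_metric` and `accuracy_pct` are hypothetical, the labels and predictions are synthetic, sampling is assumed to be without replacement, and the sketch re-scores a fixed set of predictions because the note does not say whether models were refit on each draw.

```python
import numpy as np

def subsample_metric(y_true, y_pred, metric, n_draw, n_repeat=1000, seed=0):
    """Return (mean, sample standard deviation) of `metric` over random subsamples.

    Assumption: each repetition draws `n_draw` cases without replacement.
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = np.empty(n_repeat)
    for i in range(n_repeat):
        idx = rng.choice(len(y_true), size=n_draw, replace=False)
        scores[i] = metric(y_true[idx], y_pred[idx])
    return scores.mean(), scores.std(ddof=1)

def accuracy_pct(y_true, y_pred):
    """Accuracy as a percentage, matching the units of the table."""
    return 100.0 * np.mean(y_true == y_pred)

# Synthetic stand-in for the BC setting: 569 cases, 300 drawn per repetition.
demo_rng = np.random.default_rng(42)
y_true = demo_rng.integers(0, 2, size=569)
flip = demo_rng.random(569) < 0.06          # hypothetical error rate, not from the paper
y_pred = np.where(flip, 1 - y_true, y_true)

mean_acc, sd_acc = subsample_metric(y_true, y_pred, accuracy_pct, n_draw=300)
print(f"accuracy: mean {mean_acc:.2f}, SD {sd_acc:.2f}")
```

Any other indicator with the same `(y_true, y_pred)` signature could be passed in place of `accuracy_pct`; threshold-free indicators such as AUC would need predicted scores rather than hard labels.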


ISSN: 1756-0500
