Fig. 1 | BMC Research Notes

Fig. 1

From: Teaching reproducible research for medical students and postgraduate pharmaceutical scientists

Bootstrapped performance metrics used to derive mean estimates and 95% confidence intervals from the original publication [12], from the reproduction using the supplied code written in Python, and from the supplied code translated to R. Average risk reductions were calculated for two subgroups (buckets): patients with predicted benefit in absolute risk reduction (ARR > 0) and patients without predicted benefit (ARR ≤ 0). A calibration line was fitted between quintiles of ARRs and predicted risk; its slope is the quantity included in this set of performance metrics. As a decision value, the model-predicted restricted mean survival time [RMST (days)] indicates the mean time to event if treatment choice had been based on the predicted individual benefit [and is thus to be compared with the baseline value of 1061.2 days, 95% confidence interval: (1057.4; 1064.1)]. The c-for-benefit is a metric reflecting the model’s ability to predict treatment benefit (rather than risk for an outcome) [20]. When the Python implementation was used to calculate this metric from the individual risk estimates reproduced in R, it yielded an estimate of 0.61 (0.55; 0.72). Of note, we restrict the presentation of results to distributions from resampling and their summary parameters; further numerical metrics to quantify reproducibility are omitted for simplicity. Analyses were performed using the Anaconda distribution of Python version 3.7.3 (Anaconda Software Distribution, version 2–2.4.0) and the R software environment version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria).
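To illustrate how a mean estimate and a 95% confidence interval can be derived from resampling, the following is a minimal sketch of a percentile bootstrap in Python. It is not the code supplied with the original publication [12]; the function name `bootstrap_ci`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap_ci(values, metric=statistics.fmean, n_resamples=1000,
                 alpha=0.05, seed=0):
    """Percentile bootstrap (illustrative sketch, not the published code).

    Returns (mean of the bootstrap estimates, lower bound, upper bound).
    """
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_resamples):
        # Draw a resample of the same size, with replacement.
        resample = rng.choices(values, k=n)
        estimates.append(metric(resample))
    estimates.sort()
    # Approximate positions of the alpha/2 and 1 - alpha/2 percentiles.
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = int((1 - alpha / 2) * n_resamples) - 1
    return statistics.fmean(estimates), estimates[lower_idx], estimates[upper_idx]


# Toy data (hypothetical): 200 draws from a standard normal distribution.
toy_rng = random.Random(1)
toy_values = [toy_rng.gauss(0.0, 1.0) for _ in range(200)]
mean_est, ci_low, ci_high = bootstrap_ci(toy_values)
print(f"{mean_est:.3f} ({ci_low:.3f}; {ci_high:.3f})")
```

Any function that maps a sample to a single number can be passed as `metric`, so the same resampling loop could in principle wrap other performance metrics; the fixed `seed` is there so that repeated runs give identical intervals.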
