Prediction model training, calibration, and effectiveness
The variables that were statistically significant in the preceding analysis were entered into a multivariable logistic regression analysis to identify independent risk factors for postoperative severe ALI (Table 4).
[Insert Table 4 here]
Based on the results of the multivariable logistic regression analysis, a risk-scoring model was established. The preoperative risk score was calculated as follows:
logit p = ln [p/(1 − p)] = −2.974 + 1.497 × (CHD) + 0.938 × (CPB duration ≥ 257.5 min) + 0.722 × (LA diameter ≥ 35.5 mm) − 0.814 × (hemoglobin ≤ 139.5 g/L) + 0.771 × (ICU OI ≤ 100 mmHg) + 0.953 × (LVPWT ≤ 10.5 mm) + 0.869 × (NEUT ≥ 0.824) + 0.976 × (preCPB OI ≤ 100 mmHg).
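To illustrate how the formula converts a patient's risk factors into a predicted probability, the following minimal Python sketch evaluates the logit and applies the inverse logit transformation, p = 1/(1 + e^(−logit)). It assumes that each parenthesized term is a binary indicator (1 if the condition is met, 0 otherwise). The dictionary keys and the function name are hypothetical labels chosen for this sketch and do not come from the study; the coefficients are transcribed from the formula above.

```python
import math

# Intercept and coefficients transcribed from the published formula.
# The key names are hypothetical labels used only in this sketch.
INTERCEPT = -2.974
COEFFICIENTS = {
    "chd": 1.497,
    "cpb_duration_ge_257_5_min": 0.938,
    "la_diameter_ge_35_5_mm": 0.722,
    "hemoglobin_le_139_5_g_per_l": -0.814,
    "icu_oi_le_100_mmhg": 0.771,
    "lvpwt_le_10_5_mm": 0.953,
    "neut_ge_0_824": 0.869,
    "pre_cpb_oi_le_100_mmhg": 0.976,
}


def predicted_probability(indicators):
    """Return the predicted probability for a dict of 0/1 indicators.

    Indicators that are not supplied are treated as 0 (condition not met).
    This is an assumption of the sketch, not a rule stated in the study.
    """
    unknown = set(indicators) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown indicator(s): {sorted(unknown)}")
    logit = INTERCEPT + sum(
        COEFFICIENTS[name] * (1 if value else 0)
        for name, value in indicators.items()
    )
    # Inverse logit: p = 1 / (1 + exp(-logit))
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    baseline = predicted_probability({})
    with_chd = predicted_probability({"chd": 1})
    print(f"no indicators met: p = {baseline:.3f}")
    print(f"CHD only:          p = {with_chd:.3f}")
```

With every indicator set to 0, the sketch returns about 0.049, the baseline risk implied by the intercept alone.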
The area under the ROC curve (AUC) of the model was 0.805 (95% CI: 0.746–0.864), and the Hosmer–Lemeshow goodness-of-fit test for the logistic regression model was not significant (χ2 = 6.037, df = 8, P = 0.643), indicating no evidence of poor calibration (Figure 1). The model was then applied to the validation group to evaluate its calibration and discrimination. The AUC in the validation group was 0.778 (95% CI: 0.667–0.889), and the Hosmer–Lemeshow test was again not significant (χ2 = 3.3782, df = 7, P = 0.848) (Figure 2). The mean AUC from tenfold cross-validation was 0.756 (range, 0.628–0.839), which was similar to the AUC obtained in the validation group.
[Insert Figures 1 and 2 here]
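The P value that accompanies a Hosmer–Lemeshow statistic is the upper-tail probability of a chi-square distribution with the stated degrees of freedom; a large P value indicates no evidence of lack of fit. The sketch below recomputes that probability from the χ2 and df values reported above. It uses the standard closed-form series for integer degrees of freedom and only the Python standard library; it is not the software used in the study, and the function name is a hypothetical choice for this example.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses the closed-form finite series that holds for positive integer
    degrees of freedom.
    """
    if not isinstance(df, int) or df < 1:
        raise ValueError("df must be a positive integer")
    if x < 0:
        raise ValueError("x must be non-negative")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2))
    #         + 2*phi(sqrt(x)) * sum_{r=1}^{(df-1)/2} x^(r - 1/2) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    normal_pdf = math.exp(-half) / math.sqrt(2.0 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + 2.0 * normal_pdf * total


if __name__ == "__main__":
    # Statistics as reported for the model and for the validation group.
    print(f"chi2 = 6.037,  df = 8: P = {chi2_upper_tail(6.037, 8):.3f}")
    print(f"chi2 = 3.3782, df = 7: P = {chi2_upper_tail(3.3782, 7):.3f}")
```

Applied to the first statistic (χ2 = 6.037, df = 8), the sketch returns approximately 0.643, matching the reported P value; applied to χ2 = 3.3782 with df = 7, it returns a value close to 0.848.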