Main findings
Stage 1 performed reasonably well at predicting surgical complexity
level A, with high sensitivity and NPV, but moderate specificity and
PPV. The intermediate stages 2 and 3 performed poorly for predicting
corresponding surgical complexity levels. Stage 4 had poor PPV for
predicting surgical complexity level D. Pre-determined staging
thresholds performed well at discerning skill level A/B/C versus D
(stage 4) but had low specificity for A versus B/C/D and A/B/C versus D
(stages 1, 2 and 3).
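
For readers less familiar with the four measures named above, the following is a minimal sketch of how sensitivity, specificity, PPV and NPV are derived from a 2x2 table comparing a stage-based prediction with the observed surgical complexity level. The function name, the class name and the counts in the usage lines are illustrative assumptions; they are not the study's data or analysis code.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a margin of the 2x2 table is empty.
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> DiagnosticMetrics:
    """Compute the four standard accuracy measures from 2x2 counts.

    tp: stage predicts the complexity level, and it is observed
    fp: stage predicts the complexity level, but it is not observed
    fn: stage does not predict the level, but it is observed
    tn: stage does not predict the level, and it is not observed
    """
    return DiagnosticMetrics(
        sensitivity=_ratio(tp, tp + fn),  # observed cases correctly predicted
        specificity=_ratio(tn, tn + fp),  # non-cases correctly excluded
        ppv=_ratio(tp, tp + fp),          # predictions that were correct
        npv=_ratio(tn, tn + fn),          # exclusions that were correct
    )


# Hypothetical counts, chosen only to illustrate the calculation.
example = diagnostic_metrics(tp=90, fp=40, fn=10, tn=60)
print(example)
```

With these made-up counts, sensitivity and NPV come out higher than specificity and PPV, which is the same qualitative pattern described for stage 1 above: few observed cases are missed, but a sizeable share of positive predictions are incorrect.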