Statistical analysis
We compared groups by study stage and by sub-category of preterm birth using Welch’s one-way analysis of variance (ANOVA) or the Kruskal-Wallis rank-sum test, according to variable distribution, and used the Mann-Whitney U test to compare term and preterm delivery groups. Categorical variables and frequencies were compared between groups using chi-squared tests. A p-value <0.05 was set as the threshold for statistical significance. For details on the variable transformation and imputation used to meet model assumptions, see Supporting Information.
To characterize a cytokine profile in CVF that describes the clinical manifestations of the labor stages, a principal component analysis (PCA) was performed on cytokine concentrations standardized to a mean of 0 and a standard deviation of 1. The number of components to retain was chosen using the scree plot criterion.
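As an illustration of the standardization step that precedes the PCA, the following minimal Python sketch converts one variable to z-scores and computes the proportion of variance explained by each component, which is the quantity a scree plot displays. It is not the code used in the study (the analyses were run in R); the function names, the example concentrations, and the eigenvalues are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (x - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator)
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(x - m) / s for x in values]


def explained_variance(eigenvalues):
    """Proportion of total variance carried by each component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


# Hypothetical concentrations for a single cytokine (arbitrary units).
example = [12.0, 15.0, 9.0, 20.0, 14.0]
z = standardize(example)

# Hypothetical eigenvalues of a correlation matrix, largest first.
proportions = explained_variance([3.0, 1.5, 0.5])
```

After standardization every cytokine contributes on the same scale, so components are not dominated by the analytes with the largest absolute concentrations.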
We selected the cytokines that best described the stages and extracted principal component scores to develop explanatory linear mixed-effects models of the pro-inflammatory signaling associated with the onset of labor, using stage as the independent variable and gestational age (week of sample collection) as the moderating variable.
Model fit was evaluated using R² and minimization of the Bayesian Information Criterion (BIC); multicollinearity was assessed using tolerance and the variance inflation factor (VIF). Homoscedasticity and linearity assumptions were tested for the predictors, and normality of the residuals was assessed. Model parameters are reported as β coefficients with 95% confidence intervals (CI).
To assess the diagnostic performance of IL-6 for identifying spontaneous labor, we calculated an optimal cut-point by maximizing Youden’s J index and estimated the corresponding sensitivity, specificity, predictive values, and likelihood ratios using the OptimalCutpoints R package. In addition, we evaluated its time-varying diagnostic performance using time-dependent ROC curves, applying the Kaplan-Meier estimator at different time-points with the timeROC R package.
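The cut-point criterion can be sketched in a few lines. Youden's J index is defined as J = sensitivity + specificity - 1, and the optimal cut-point is the threshold that maximizes it. The Python function below is a simplified illustration, not the OptimalCutpoints implementation; the function name, the rule that a value at or above the threshold counts as test-positive, and the example data are assumptions.

```python
def youden_cutpoint(values, labels):
    """Return the threshold maximizing J = sensitivity + specificity - 1.

    A sample is test-positive when its value is >= the threshold.
    labels: 1 = condition present, 0 = condition absent.
    Ties in J are resolved in favor of the lowest threshold.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    best = None
    for c in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= c and y == 1)
        fp = sum(1 for v, y in zip(values, labels) if v >= c and y == 0)
        sens = tp / pos
        spec = (neg - fp) / neg
        j = sens + spec - 1
        if best is None or j > best["j"]:
            best = {
                "cutpoint": c,
                "sensitivity": sens,
                "specificity": spec,
                "j": j,
                # Likelihood ratios; infinite when the denominator is zero.
                "lr_pos": sens / (1 - spec) if spec < 1 else float("inf"),
                "lr_neg": (1 - sens) / spec if spec > 0 else float("inf"),
            }
    return best


# Hypothetical marker values and outcomes (1 = spontaneous labor).
values = [5, 8, 12, 20, 25, 30, 40, 55]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
result = youden_cutpoint(values, labels)
```

In this toy data set the thresholds 20 and 30 both reach J = 0.75; the sketch keeps the lower one, which favors sensitivity over specificity.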
Intervals between sampling and delivery were calculated, and the estimated cut-point was evaluated by Kaplan–Meier analysis and the log-rank test using the survminer R package. Cox regression analysis was used to assess whether IL-6 was associated with the risk of delivery at any gestational age. Schoenfeld residuals were used to test the proportional hazards assumption. The predictors were tested for homoscedasticity and linearity. Finally, a post-estimation simulation of the Cox models was performed to evaluate adjusted hazard ratio estimates across IL-6 values using the simPH R package. All statistical analyses were performed using R statistical software (Version 4.0.2).
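For readers unfamiliar with the Kaplan–Meier estimator applied to the sampling-to-delivery intervals, the product-limit calculation can be sketched as follows. This is an illustrative Python version rather than the R code used in the study; the function name and the example intervals are hypothetical, and an event indicator of 0 marks a censored observation.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times:  interval from sampling to delivery or to censoring.
    events: 1 if delivery was observed, 0 if the observation was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deliveries = 0
        leaving = 0
        # Group all observations that share the same time.
        while i < len(data) and data[i][0] == t:
            deliveries += data[i][1]
            leaving += 1
            i += 1
        if deliveries > 0:
            # Product-limit step: multiply by the fraction still undelivered.
            surv *= 1 - deliveries / at_risk
            curve.append((t, surv))
        # Events and censored observations both leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical intervals in days and event indicators.
times = [2, 3, 3, 5, 8, 10]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Observations censored at a given time are counted as still at risk at that time, which is the usual convention when events and censoring coincide.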