Statistical analysis
Continuous variables were reported as means with standard deviations or medians with interquartile ranges (IQRs), according to their distribution. Categorical variables were reported as percentages. First, the IG-21st birth weight percentile was calculated using the IG-21st software, and the WHO birth weight percentile was calculated using the coefficients derived from the WHO study. Then, for each growth chart (IG-21st and WHO standards), we calculated the proportion of live births with a birth weight below the 10th percentile (SGA) and below the 3rd percentile (FGR). To evaluate the relative validity of each growth chart, neonatal outcomes (i.e., rate of low Apgar scores, ponderal index, and cephalization index) in the "non-overlapping" populations were determined and compared with those of neonates at or above the 10th percentile using the chi-squared test. Finally, relative risk (RR) was calculated as the ratio of the incidence of adverse perinatal outcomes among SGA and FGR neonates.
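The percentile cut-offs and the RR calculation can be illustrated with a minimal sketch. It is written in Python for illustration only (the study analysis was performed in R). The function names and the records are hypothetical, and the choice of neonates at or above the 10th percentile as the reference group for the RR is an assumption rather than a statement of the authors' procedure.

```python
def is_sga(percentile):
    """SGA: birth weight below the 10th percentile."""
    return percentile < 10


def is_fgr(percentile):
    """FGR: birth weight below the 3rd percentile."""
    return percentile < 3


def incidence(outcomes):
    """Proportion of neonates with the adverse outcome (True counts as 1)."""
    return sum(outcomes) / len(outcomes)


def relative_risk(exposed_outcomes, reference_outcomes):
    """Incidence in the exposed group divided by incidence in the reference group."""
    return incidence(exposed_outcomes) / incidence(reference_outcomes)


# Hypothetical records: (birth weight percentile, adverse outcome present).
records = [
    (2, True), (5, True), (8, False), (9, False), (25, False),
    (40, True), (50, False), (60, False), (75, False), (90, False),
]

# Assumed reference group: neonates at or above the 10th percentile.
reference = [outcome for pct, outcome in records if not is_sga(pct)]
sga = [outcome for pct, outcome in records if is_sga(pct)]
fgr = [outcome for pct, outcome in records if is_fgr(pct)]

rr_sga = relative_risk(sga, reference)
rr_fgr = relative_risk(fgr, reference)
print(f"RR (SGA vs reference): {rr_sga:.2f}")
print(f"RR (FGR vs reference): {rr_fgr:.2f}")
```

The same two functions would be applied once per growth chart, with the percentile taken from the IG-21st or the WHO calculation respectively.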
To account for a country-specific effect, we further evaluated the association between SGA, as defined by each standard, and adverse outcomes using multilevel regression analyses, with subjects at the lower level and countries at the upper level. The relationships between patient-level and country-level variables and adverse perinatal outcomes were examined with multilevel linear regression using the R 'lme4' package. Fixed effects were estimated for maternal education and nulliparity. The multilevel analysis was implemented in a stepwise manner. First, an unconditional means model was used to determine the proportion of variance attributable to the multilevel design. Second, all variables selected for inclusion were added to the unconditional means model as fixed effects, and nonsignificant variables were removed sequentially by backward elimination until only significant (i.e., p<0.05) variables remained.
In addition, diagnostic performance (sensitivity, specificity, positive and negative likelihood ratios, and the diagnostic odds ratio) was estimated and used to compare the accuracy of the two fetal growth standards in identifying neonates at risk of adverse perinatal outcomes. The likelihood ratios and diagnostic odds ratios were compared using 2000 bootstrap replicates drawn with replacement. Receiver operating characteristic (ROC) curve analysis was used to determine the performance of each fetal growth standard in predicting a low Apgar score and ponderal index. The resulting areas under the ROC curves (AUCs) were compared using the DeLong method. Data processing was performed using R software. A p-value <0.05 was considered statistically significant.
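The diagnostic performance measures and their bootstrap comparison can be sketched as follows. This is an illustrative Python sketch, not the authors' R code. All function names and data are hypothetical. Two choices are assumptions not stated in the text: neonates are resampled in pairs (each neonate keeps its classification under both standards), and 0.5 is added to every cell of a 2x2 table that contains a zero so that the ratios stay defined.

```python
import random


def confusion_counts(flags, outcomes):
    """Return (TP, FP, FN, TN) for a binary classification against a binary outcome."""
    tp = sum(1 for f, o in zip(flags, outcomes) if f and o)
    fp = sum(1 for f, o in zip(flags, outcomes) if f and not o)
    fn = sum(1 for f, o in zip(flags, outcomes) if not f and o)
    tn = sum(1 for f, o in zip(flags, outcomes) if not f and not o)
    return tp, fp, fn, tn


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, likelihood ratios and diagnostic odds ratio."""
    # Assumption: continuity correction of 0.5 when any cell is empty.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,
    }


def bootstrap_dor_difference(flags_a, flags_b, outcomes, n_replicates=2000, seed=0):
    """Percentile 95% interval for DOR(standard A) - DOR(standard B)."""
    rng = random.Random(seed)
    n = len(outcomes)
    diffs = []
    for _ in range(n_replicates):
        # Resample neonates with replacement, keeping each neonate's data paired.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [outcomes[i] for i in idx]
        dor_a = diagnostic_metrics(*confusion_counts([flags_a[i] for i in idx], y))["dor"]
        dor_b = diagnostic_metrics(*confusion_counts([flags_b[i] for i in idx], y))["dor"]
        diffs.append(dor_a - dor_b)
    diffs.sort()
    lower = diffs[round(0.025 * n_replicates)]
    upper = diffs[round(0.975 * n_replicates) - 1]
    return lower, upper


# Hypothetical data: SGA flag under each standard and adverse outcome per neonate.
sga_standard_a = [True, True, True, False, False, False, False, True, False, False, False, False]
sga_standard_b = [True, True, False, False, False, False, False, False, False, True, False, False]
adverse = [True, True, False, False, False, True, False, False, False, False, False, False]

print(diagnostic_metrics(*confusion_counts(sga_standard_a, adverse)))
print(bootstrap_dor_difference(sga_standard_a, sga_standard_b, adverse))
```

An interval that excludes zero would suggest that the two standards differ in diagnostic odds ratio. The same resampling loop could be reused for the likelihood ratios by swapping the `"dor"` key for `"lr_pos"` or `"lr_neg"`.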