Strengths and Limitations
Our study has multiple strengths. Rather than using a single fetal
weight estimate per participant to construct the growth curve as Hadlock
did,11 our sex-specific standard is based on
longitudinal assessments, with the first EFWs obtained starting at 16
weeks, which is earlier than in the Hadlock standard. Our inclusion of only
term births in the derivation of the sex-specific equation removed bias
that would be introduced by the association of preterm birth with poor
growth. Because of this, our sex-specific standard is more
representative of expected fetal growth in ongoing pregnancies. A final
strength is the assessment of differences in clinical management and
outcomes for newborns who were classified differently by the
sex-specific standard than by the sex-neutral standard, which provided
empiric substantiation of the clinical relevance of the differences
between sex-neutral and sex-specific curves.
Our use of a nested cohort to derive the sex-specific standard and an
expanded cohort to assess it is valid because this study's design
differs from a traditional derivation-validation approach. In such an
approach, separate cohorts are needed because the primary outcome is
used to derive the model, making it invalid to test the model’s
prediction of the same outcome in the same cohort. In our case, this
would be analogous to deriving a fetal growth equation based on its
prediction of morbidity and then testing its prediction of morbidity.
However, our approach was to derive an equation for fetal growth based
on how well it represents available fetal measurements and then assess
how designations based on this fetal growth equation are associated with
clinical outcomes in the parent cohort. Even so, our analyses of
clinical outcomes and management should be interpreted as exploratory
and hypothesis-generating rather than as validating.
The primary limitation of our study is that ultrasound EFWs were not
collected uniformly across gestation but were instead concentrated
around nuMoM2b study visits; a standard derived from EFWs collected
evenly throughout pregnancy may better represent expected fetal growth.
Additionally, sex
was ascertained at birth, so our sex-specific curve needs to be
validated using a cohort with prenatally identified fetal sex. Further,
we cannot rule out that clinical management based on prenatal suspicion
of FGR may have introduced bias by lowering clinicians’ thresholds for
cesarean delivery. This is plausible, since the group of male newborns
considered SGA by the sex-neutral standard had higher cesarean rates for
fetal compromise but did not experience concrete morbidity more often
than the AGA group. Conversely, clinical action based on prenatal
suspicion of FGR may have prevented morbidity, potentially leading us
to underestimate the true rates of perinatal morbidity among newborns
considered SGA by the sex-neutral standard but AGA by the sex-specific
standard. This is a less likely explanation for our findings, however,
since it is implausible that growth-restricted fetuses who undergo
delivery for FGR would have lower rates of morbidity than the AGA group,
which is what we found among female newborns. Unfortunately, information
on prenatal suspicion of LGA or macrosomia was not collected in the
nuMoM2b study, so we are unable to determine whether such suspicion may
have also altered clinical decisions related to mode of delivery.