DATA AND METHODS
Data
This study uses a random sample of two million birth records (one
million for training and the other one million for testing) drawn from
all births in the 50 states, the District of Columbia, and US
territories, based on the 2016 to 2018 birth data provided by the NCHS
of the CDC. This database includes records of all births in the United
States and its territories reported through the birth certificate
required by state laws (21). The data include information on infants,
sociodemographic characteristics of parents, and maternal risk factors.
The training data are drawn from 2016 and 2017, and the 2018 data are
used for testing. The study period begins in 2016 because this is the
first year in which all US states had completed the adoption of the 2003
revised birth certificate (22).
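As a rough illustration of the year-based split described above, the sampling could be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in the study: the record layout, the `year` field name, and the helper `sample_by_year` are hypothetical, and the text does not say whether sampling within 2016 and 2017 was pooled or stratified (pooled simple random sampling is assumed here).

```python
import random


def sample_by_year(records, years, n, seed=0):
    """Draw n records without replacement from those whose 'year' is in years.

    'year' is a hypothetical field name; pooled simple random sampling
    across the listed years is an assumption, not the study's stated design.
    """
    pool = [r for r in records if r["year"] in years]
    rng = random.Random(seed)
    return rng.sample(pool, n)


# Toy stand-in for the birth file; the study samples one million records
# for training and one million for testing.
records = [{"id": i, "year": 2016 + i % 3} for i in range(3000)]

train = sample_by_year(records, {2016, 2017}, n=100)
test = sample_by_year(records, {2018}, n=100)
```

Splitting by calendar year rather than at random keeps the testing data strictly later in time than the training data.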
Outcomes and Predictors
This analysis uses LBW (1 if <2,500 g) and PTB (1 if <37 gestational
weeks) as the outcome measures of infant prematurity, following the
definitions established by the World Health Organization (WHO) (23).
Table 1 presents the characteristics of infants in the study sample.
Approximately 7% and 9% of infants are coded as LBW and PTB,
respectively. More than half (51%) of the sample are male, and the
percentage of singleton births is 97%. The right-most column indicates
that the differences in these characteristics between the training and
testing data are not statistically significant.
[Table 1 about here]
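The outcome coding described above can be written compactly. The sketch below is illustrative only: the function and argument names are hypothetical, and the handling of missing birth weight or gestational age is not described in the text and is therefore left out.

```python
LBW_THRESHOLD_GRAMS = 2500   # low birth weight: under 2,500 g
PTB_THRESHOLD_WEEKS = 37     # preterm birth: under 37 gestational weeks


def code_outcomes(birth_weight_g, gestation_weeks):
    """Return (lbw, ptb) indicators, each 1 if the condition holds, else 0."""
    lbw = 1 if birth_weight_g < LBW_THRESHOLD_GRAMS else 0
    ptb = 1 if gestation_weeks < PTB_THRESHOLD_WEEKS else 0
    return lbw, ptb
```

Both thresholds are strict inequalities, so an infant weighing exactly 2,500 g or born at exactly 37 weeks is coded 0.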
Predictors include various factors that can affect birth outcomes (see
Appendix 2 for the full list of the predictors). Sociodemographic
characteristics of parents include maternal age, race, education,
marital status, and source of payment. Maternal risk factors include
live-birth order, mother’s body mass index (BMI), weight gain during
pregnancy, smoking, infections (e.g., gonorrhea and syphilis), previous
preterm birth, hypertension eclampsia, diabetes, hypertension, and the
use of infertility treatment.
Approaches for Prediction
This study compares the performance of two approaches: OLS regression
and DNN. The OLS classifier expresses the likelihood of premature birth
as a weighted linear combination of predictors. The DNN classifier in
this study has a feedforward neural network architecture that stacks
non-linear modules in multiple hidden layers between the input and
output layers (Figure 1). Both classifiers take 160 inputs created from
43 variables and produce the probabilities of two classes: one for
premature birth and zero otherwise (no premature birth).
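The text does not state how the 43 variables are expanded into 160 inputs. One common way this happens is indicator (one-hot) coding of categorical predictors, which is assumed in the sketch below. The variable names and category lists are hypothetical and do not reproduce the coding in Appendix 2.

```python
def one_hot(value, categories):
    """Return one indicator per category: 1.0 for the matching one, else 0.0."""
    return [1.0 if value == c else 0.0 for c in categories]


# Hypothetical category lists, for illustration only.
CATEGORIES = {
    "marital_status": ["married", "unmarried"],
    "payment": ["medicaid", "private", "self_pay", "other"],
}


def encode(record):
    """Turn one record into a flat list of numeric model inputs."""
    features = [float(record["maternal_age"])]  # numeric variable passes through
    for name, cats in CATEGORIES.items():
        features.extend(one_hot(record[name], cats))
    return features


example = encode(
    {"maternal_age": 29, "marital_status": "married", "payment": "private"}
)
# Three variables become seven inputs: 1 numeric + 2 + 4 indicators.
```

Under this assumption, the number of inputs exceeds the number of variables because each categorical variable contributes one input per category.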
The DNN classifier optimizes the weights of inputs by gradient descent,
which repeatedly computes outputs and errors from the given inputs and
adjusts the weights until the objective function is minimized (20).
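To make the update loop concrete, the sketch below runs gradient descent on a single sigmoid unit with one input. It is a simplified illustration, not the study's network: the binary cross-entropy objective, the learning rate, the number of steps, and the toy data are all assumptions, since the text does not name the objective function or the optimizer settings.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def loss(w, b, xs, ys):
    """Mean binary cross-entropy (assumed objective)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)


def gradient_descent(xs, ys, lr=0.5, steps=200):
    """Repeat: compute outputs and errors, then adjust the weight and bias."""
    w, b = 0.0, 0.0
    history = [loss(w, b, xs, ys)]
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the loss w.r.t. w*x + b
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
        history.append(loss(w, b, xs, ys))
    return w, b, history


xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b, history = gradient_descent(xs, ys)
```

In a multi-layer network the same loop applies, with the gradients for hidden-layer weights obtained by backpropagation.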
[Figure 1 about here]
In the DNN classifier, each neuron produces outputs as a linear
combination of given inputs and weights (24). In the input and hidden
layers, the Rectified Linear Unit (ReLU) function adds non-linearity to
these outputs. The basic form of the ReLU function is:
\(f\left(a_{j}\right)=\max(0,a_{j})\) [1]
where
\(a_{j}=w_{j0}+\sum_{i=1}^{D}w_{ji}x_{i}\) [2]
In the equations, \(j\) indexes perceptrons and \(D\) denotes the number
of inputs to the perceptron. \(x_{i}\) denotes the inputs, \(w_{ji}\)
denotes the weights of the inputs, and \(w_{j0}\) denotes the bias term.
In each perceptron, the ReLU function returns \(a_{j}\) if it is larger
than 0 and returns 0 otherwise.
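The computation in one layer of perceptrons can be sketched directly: a bias plus a weighted sum of the inputs, passed through ReLU. The function names and numeric values below are hypothetical and chosen only to show one active and one inactive unit.

```python
def pre_activation(weights, bias, inputs):
    """a_j: the bias plus the weighted sum over the D inputs."""
    return bias + sum(w * x for w, x in zip(weights, inputs))


def relu(a):
    """Return a if it is positive, otherwise 0."""
    return max(0.0, a)


def hidden_layer(weight_rows, biases, inputs):
    """Apply ReLU to the pre-activation of each perceptron j in the layer."""
    return [
        relu(pre_activation(w, b, inputs))
        for w, b in zip(weight_rows, biases)
    ]


# Two perceptrons, two inputs (illustrative numbers).
out = hidden_layer(
    weight_rows=[[0.5, -1.0], [-0.5, 0.25]],
    biases=[0.1, -0.2],
    inputs=[2.0, 1.0],
)
# The first unit has a positive pre-activation and passes it through;
# the second has a negative pre-activation and outputs 0.
```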
The output layer uses a sigmoid activation function because the
outcomes are binary. This function returns the predicted probability of
the positive class in the range of 0 to 1. The basic form is:
\(z(a_{j})=\frac{1}{1+\exp(-a_{j})}\) [3]
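For reference, the standard logistic sigmoid, 1 / (1 + exp(-a)), can be implemented in a numerically stable way as sketched below. The helper `predict_class` and its default threshold of 0.5 are hypothetical; the threshold is one of the quantities tuned in this study.

```python
import math


def sigmoid(a):
    """Logistic sigmoid, written so that exp() never receives a large
    positive argument and therefore cannot overflow."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def predict_class(a, threshold=0.5):
    """Map the output probability to class 1 or 0 at the given threshold."""
    return 1 if sigmoid(a) >= threshold else 0
```

The function is increasing in its argument and equals 0.5 at zero, so larger pre-activations correspond to higher predicted probabilities of premature birth.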
In training the DNN classifier, the dataset is divided into ten mutually
exclusive subsets; training and testing are iterated with nine subsets,
while the remaining one is held out for performance validation (25). We
set a large batch size of 1,024 and conduct a grid search over class
weights on positive cases, the number of hidden layers, and
classification thresholds, because LBW and PTB are rare events with an
occurrence rate of less than 10% (26). The number of units is set to 64,
and the dropout rate is set to 0.1.
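The partition into ten subsets and the enumeration of grid-search settings can be sketched as follows. This is one reading of the procedure, in which each subset is held out in turn, and it is not a description of the actual implementation. The grid values in `GRID` are hypothetical, because the text does not report the candidate values; only the batch size, number of units, and dropout rate come from the text.

```python
import itertools
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_validation_splits(folds):
    """Yield (train_idx, val_idx) pairs, holding out each fold in turn."""
    for held_out in range(len(folds)):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, folds[held_out]


# Hypothetical candidate values for the grid search.
GRID = {
    "positive_class_weight": [1, 5, 10],
    "hidden_layers": [1, 2, 3],
    "threshold": [0.3, 0.5],
}
# Fixed settings reported in the text.
FIXED = {"batch_size": 1024, "units": 64, "dropout": 0.1}


def grid_configs(grid, fixed):
    """Yield one settings dictionary per combination of grid values."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield {**fixed, **dict(zip(keys, values))}


folds = k_fold_indices(1000)
configs = list(grid_configs(GRID, FIXED))
```

Each configuration would then be trained and validated on the splits, and the setting with the best validation performance retained.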
Model performance is measured with three metrics: accuracy, sensitivity,
and specificity. Accuracy refers to the proportion of true positives and
true negatives in the sample, i.e., the proportion of correct
predictions. Sensitivity, measured as the proportion of actual positive
cases that are correctly predicted as positive, indicates how accurately
the model detects the outcome. Because LBW and PTB are rare and the
classes are therefore imbalanced, achieving good sensitivity is more
crucial than accuracy or specificity for showing that the classifier is
useful for predicting such infrequent outcomes. Lastly, specificity is
calculated as the proportion of actual negative cases that are correctly
predicted as negative. This metric indicates how accurately the
classifier identifies births without LBW and PTB.
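The three metrics follow from the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), using the standard confusion-matrix definitions: accuracy = (TP + TN) / N, sensitivity = TP / (TP + FN), and specificity = TN / (TN + FP). The sketch below uses hypothetical function names and a toy example of ten births with one positive case.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels coded 1 (positive) and 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        # Share of actual positives that are predicted positive.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Share of actual negatives that are predicted negative.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Toy example: one positive case in ten, and a classifier that always
# predicts the negative class.
y_true = [1] + [0] * 9
always_negative = [0] * 10
result = metrics(y_true, always_negative)
# accuracy is 0.9 and specificity is 1.0, but sensitivity is 0.0
```

The toy example shows why accuracy alone is misleading for rare outcomes: a classifier that never predicts a premature birth still scores 90% accuracy while detecting none of the positive cases.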