DATA AND METHODS

Data

This study uses two million randomly sampled records (one million for training and one million for testing) drawn from all births in the 50 states, the District of Columbia, and the US territories, based on the 2016 to 2018 birth data provided by the NCHS of the CDC. This database includes records of all births in the United States and its territories reported through the birth certificates required by state laws (21). The data include information on infants, sociodemographic characteristics of parents, and maternal risk factors. The training data are drawn from 2016 and 2017, and the 2018 data are used for testing. The study period starts in 2016 because this is the first year in which all US states had completed the adoption of the 2003 revised birth certificate (22).

Outcomes and Predictors

This analysis uses LBW (1 if <2,500 g) and PTB (1 if <37 gestational weeks) as the outcome measures of infant prematurity, following the definitions established by the World Health Organization (WHO) (23). Table 1 presents the characteristics of infants in the study sample. Approximately 7% and 9% of infants are coded as LBW and PTB, respectively. More than half (51%) of the sample are male, and 97% of births are singletons. The right-most column indicates that the differences in these characteristics between the training and testing data are not statistically significant.
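The outcome coding described above can be illustrated with a minimal sketch. The function name `code_outcomes` and its argument names are hypothetical and do not correspond to field names in the NCHS files; only the two cutoffs come from the text.

```python
# Cutoffs taken from the text (WHO definitions).
LBW_THRESHOLD_GRAMS = 2500
PTB_THRESHOLD_WEEKS = 37


def code_outcomes(birth_weight_g, gestation_weeks):
    """Return (lbw, ptb) as 0/1 indicators for a single birth record.

    Hypothetical helper: argument names are illustrative only.
    """
    lbw = 1 if birth_weight_g < LBW_THRESHOLD_GRAMS else 0
    ptb = 1 if gestation_weeks < PTB_THRESHOLD_WEEKS else 0
    return lbw, ptb
```

Both comparisons are strict, so an infant weighing exactly 2,500 g or born at exactly 37 weeks is coded as zero on the corresponding outcome.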
[Table 1 about here]
Predictors include various factors that can affect birth outcomes (see Appendix 2 for the full list of predictors). Sociodemographic characteristics of parents include maternal age, race, education, marital status, and source of payment. Maternal risk factors include live-birth order, mother’s body mass index (BMI), weight gain during pregnancy, smoking, infections (e.g., gonorrhea and syphilis), previous preterm birth, hypertension eclampsia, diabetes, hypertension, and the use of infertility treatment.

Approaches for Prediction

This study compares the performance of two approaches: OLS regression and DNN. The OLS classifier expresses the likelihood of premature birth as a weighted linear combination of predictors. The DNN classifier in this study has a feedforward neural network architecture that constructs non-linear modules with multiple hidden layers between the input and output layers (Figure 1). Both classifiers take 160 inputs derived from 43 variables and produce the probabilities of two classes: one for premature birth and zero otherwise (no premature birth).
The DNN classifier optimizes the weights of the inputs using gradient descent, which repeatedly computes outputs and errors from the given inputs and adjusts the weights until the objective function is minimized (20).
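The compute-outputs, compute-errors, adjust-weights loop can be sketched on a much smaller model than the DNN itself. The example below trains a single sigmoid unit by batch gradient descent. The cross-entropy objective, the learning rate, and the number of steps are assumptions made for illustration; the text does not state which objective function or optimizer settings were used.

```python
import math


def predict(weights, bias, x):
    """Output of one sigmoid unit for a single input vector x."""
    a = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-a))


def log_loss(weights, bias, X, y):
    """Mean cross-entropy (assumed objective, not confirmed by the text)."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(weights, bias, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def gradient_descent(X, y, learning_rate=0.5, steps=200):
    """Repeat: compute outputs, compute errors, adjust weights."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            error = predict(weights, bias, xi) - yi  # output minus label
            for j, xij in enumerate(xi):
                grad_w[j] += error * xij
            grad_b += error
        # Move each weight against its averaged gradient.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias
```

A full DNN applies the same update to every layer, with the gradients obtained by backpropagation rather than by the closed-form expression used here.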
[Figure 1 about here]
In the DNN classifier, each neuron produces outputs as a linear combination of given inputs and weights (24). In the input and hidden layers, the Rectified Linear Unit (ReLU) function adds non-linearity to these outputs. The basic form of the ReLU function is:
\(f\left(a_{j}\right)=\max(0,a_{j})\) [1]
where
\(a_{j}=w_{j0}+\sum_{i=1}^{D}{w_{ji}x_{i}}\) [2]
In the equations, \(j\) indexes perceptrons and \(D\) denotes the number of inputs to the perceptron. \(x_{i}\) denotes the inputs, \(w_{ji}\) denotes the weights of the inputs, and \(w_{j0}\) is the bias term. In each perceptron, the ReLU function returns \(a_{j}\) if it is larger than 0 and returns 0 otherwise.
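As a worked illustration of Equations [1] and [2], the sketch below computes the pre-activation and ReLU output of one perceptron. It assumes the bias \(w_{j0}\) enters the sum once rather than once per input; the function names and the numeric values are hypothetical.

```python
def pre_activation(weights, bias, inputs):
    """a_j = w_j0 + sum_i w_ji * x_i for a single perceptron j."""
    return bias + sum(w * x for w, x in zip(weights, inputs))


def relu(a):
    """Rectified Linear Unit: pass positive values through, clip the rest to 0."""
    return max(0.0, a)


# Illustrative values only (not taken from the study).
a_j = pre_activation(weights=[0.5, -1.0], bias=0.25, inputs=[2.0, 1.0])
output = relu(a_j)  # 0.25 + 0.5*2.0 - 1.0*1.0 = 0.25, which is positive
```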
The output layer uses a sigmoidal activation function because the outcomes are binary. This function returns the predicted probability of the positive class in the range of 0 to 1. The basic form is:
\(z(a_{j})=\frac{1}{1+\exp(-a_{j})}\) [3]
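The output activation can be sketched as follows, using the standard logistic sigmoid \(1/(1+\exp(-a))\). The two-branch form is a common numerical-stability device and is not taken from the text; the helper `classify` and its default threshold of 0.5 are illustrative assumptions, since the study tunes the threshold by grid search.

```python
import math


def sigmoid(a):
    """Logistic function 1 / (1 + exp(-a)), written to avoid overflow."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)  # safe: a < 0, so exp(a) < 1
    return e / (1.0 + e)


def classify(a, threshold=0.5):
    """Hypothetical helper: convert a pre-activation into a 0/1 prediction."""
    return 1 if sigmoid(a) >= threshold else 0
```

Lowering `threshold` below 0.5 labels more births as positive, which is one way a classifier can trade specificity for sensitivity when the positive class is rare.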
In training the DNN classifier, the dataset is divided into ten mutually exclusive subsets; nine subsets are used to iterate training and testing, while the remaining one is held out for performance validation (25). Because LBW and PTB are rare events with occurrence rates below 10%, we set a large batch size of 1,024 and conduct a grid search over the class weight on positive cases, the number of hidden layers, and the classification threshold (26). The number of units is set to 64, and the dropout rate is set to 0.1.
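The partitioning and the grid search can be sketched as below. Two assumptions are made: the held-out subset rotates across the ten subsets (standard ten-fold cross-validation), which the text does not state explicitly, and the candidate values in `GRID` are placeholders, since the values actually searched are not reported.

```python
import random
from itertools import product


def k_fold_indices(n_records, k=10, seed=0):
    """Yield (training, validation) index lists for k disjoint folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k mutually exclusive subsets
    for held_out in range(k):
        validation = folds[held_out]
        training = [idx for f, fold in enumerate(folds)
                    if f != held_out for idx in fold]
        yield training, validation


# Placeholder candidate values (hypothetical, not the study's grid).
GRID = {
    "positive_class_weight": [1, 5, 10],
    "hidden_layers": [1, 2, 3],
    "threshold": [0.3, 0.5],
}


def grid_configurations(grid):
    """Yield one dict per combination of candidate values."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))
```

Each configuration from `grid_configurations(GRID)` would be trained and scored on every split from `k_fold_indices`, and the configuration with the best validation performance retained.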
Model performance is measured with three metrics: accuracy, sensitivity, and specificity. Accuracy refers to the proportion of true positives and true negatives in the sample, i.e., the proportion of correct outputs. Sensitivity, measured as the proportion of actual positive cases that are correctly predicted as positive, indicates how accurately the model identifies the outcome. Because LBW and PTB are rare and the classes are therefore imbalanced, good sensitivity is more crucial than accuracy and specificity for showing that the classifier is useful for predicting such infrequent outcomes. Lastly, specificity is calculated as the proportion of actual negative cases that are correctly predicted as negative. This metric indicates how accurately the classifier identifies births without LBW or PTB.
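The three metrics can be computed from confusion-matrix counts, as in the sketch below. It uses the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the function names and the toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for 0/1 labels and 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred):
    """Accuracy, sensitivity, and specificity (NaN if a class is absent)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy example with a 20% positive rate (illustrative, not study data).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10
scores = evaluate(y_true, always_negative)
```

In this toy example a classifier that always predicts the negative class reaches 80% accuracy and 100% specificity while its sensitivity is zero, which is why sensitivity carries the most weight when the outcome is rare.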