Title
Using Machine Learning and Administrative Data to Predict Premature
Births
Yongjin Choi1,2*, J. Ramon Gil-Garcia3
1Department of Infectious Disease Epidemiology, London School of Hygiene
and Tropical Medicine, London, United Kingdom
2Laboratory of Data Discovery for Health Limited (D24H), Hong Kong
Science Park, Hong Kong SAR, China
3Department of Public Administration and Policy, University at Albany,
State University of New York, United States
Abstract
Objective. To assess the potential of using machine learning
and administrative birth data for predicting premature births.
Design. The performance of ordinary least squares (OLS) and deep
neural network (DNN) classifiers in predicting low birth weight (LBW)
and preterm birth (PTB) was compared using two million randomly selected
US CDC birth records from 2016 to 2018. One million records from 2016
and 2017 were used to train the classifiers, and another million records
from 2018 were used to test them. For hyperparameter tuning, a grid
search was run over the number of hidden layers, the class weight on
positive cases, and the classification threshold.
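The design above can be sketched as a split by birth year followed by an enumeration of the tuning grid. This is a minimal illustration, not the authors' code: the `year` field name, the helper names, and every grid value are assumptions, since the abstract does not report the actual settings.

```python
from itertools import product


def split_by_year(records, train_years=(2016, 2017), test_years=(2018,)):
    """Split records into train and test sets by birth year.

    Assumes each record is a dict with a hypothetical "year" key.
    """
    train = [r for r in records if r["year"] in train_years]
    test = [r for r in records if r["year"] in test_years]
    return train, test


def build_grid(hidden_layers, positive_class_weights, thresholds):
    """Enumerate every combination of the three tuned settings."""
    return [
        {"hidden_layers": h, "positive_class_weight": w, "threshold": t}
        for h, w, t in product(hidden_layers, positive_class_weights, thresholds)
    ]


# Illustrative values only; the real grid is not given in the abstract.
grid = build_grid(
    hidden_layers=[1, 2, 3],
    positive_class_weights=[1, 5, 10],
    thresholds=[0.3, 0.5, 0.7],
)
```

Each entry in `grid` would be used to train one classifier on the 2016 and 2017 records and score it on the 2018 records. Splitting by year rather than at random keeps the test set strictly later than the training data.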
Setting and Population. All births in the US.
Methods. Ordinary least squares regression and deep neural
networks.
Main Outcome Measures. LBW (<2,500 g) and
PTB (<37 weeks).
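The two outcome definitions translate directly into binary labels. The sketch below assumes hypothetical field names (`birth_weight_grams`, `gestational_age_weeks`); the actual column names in the CDC birth files differ and are not given here.

```python
from dataclasses import dataclass

LBW_THRESHOLD_GRAMS = 2500  # from the abstract: LBW is below 2,500 g
PTB_THRESHOLD_WEEKS = 37    # from the abstract: PTB is below 37 weeks


@dataclass
class BirthRecord:
    # Hypothetical field names, chosen for illustration only.
    birth_weight_grams: int
    gestational_age_weeks: int


def label_outcomes(record):
    """Return (lbw, ptb) as binary labels, where 1 marks a positive case."""
    lbw = int(record.birth_weight_grams < LBW_THRESHOLD_GRAMS)
    ptb = int(record.gestational_age_weeks < PTB_THRESHOLD_WEEKS)
    return lbw, ptb
```

Both thresholds are strict inequalities, so a 2,500 g birth at 37 weeks is negative on both outcomes.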
Results. The classifiers generally showed high accuracy and specificity;
the DNN classifiers, however, showed a marked improvement in
sensitivity. The highest sensitivity achieved at comparable
specificity was 0.71 for LBW and 0.65 for PTB.
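Because positive cases are rare, accuracy and specificity can be high while sensitivity stays low, which is why sensitivity is reported separately. The sketch below shows how the three metrics follow from confusion-matrix counts; the toy labels are invented for illustration and are not study data.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity as a dict."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # share of positives caught
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,  # share of negatives cleared
    }


# Toy example: 2 positives among 10 records, one of them missed.
toy_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
toy_metrics = metrics(toy_true, toy_pred)
```

In this toy case accuracy is 0.9 and specificity is 1.0, yet sensitivity is only 0.5, mirroring the pattern described in the results.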
Conclusion. These findings indicate that an ML approach could
benefit PCHV programs by helping identify mothers at high risk of
premature birth. In particular, DNN classifiers trained on administrative
data can offer an accessible solution for public agencies and nonprofit
organizations that provide PCHV services and are unlikely to have
access to large clinical datasets or highly accurate genetic testing
equipment.
Funding. Faculty Research Awards Programs (FRAP) of University
at Albany.
Keywords. machine learning, artificial intelligence,
prediction, birth outcomes, low birth weight, preterm birth,
disadvantaged groups.
* Corresponding author: Yongjin Choi, Department of Infectious Disease
Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St,
London WC1E 7HT, United Kingdom, yongjin.choi@vaccineconfidence.org
Acknowledgments
The authors gratefully acknowledge the support from the Faculty Research
Awards Programs (FRAP) of University at Albany, State University of New
York.