Title
Using Machine Learning and Administrative Data to Predict Premature Births
Yongjin Choi1,2*, J. Ramon Gil-Garcia3
1Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
2Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China
3Department of Public Administration and Policy, University at Albany, State University of New York, United States
Abstract
Objective. To assess the potential of using machine learning and administrative birth data for predicting premature births.
Design. The performance of ordinary least squares (OLS) and deep neural network (DNN) classifiers in predicting low birth weight (LBW) and preterm birth (PTB) was compared using two million randomly selected US CDC birth records from 2016 to 2018. One million records from 2016 and 2017 were used to train the classifiers, and another one million records from 2018 were used to test them. For hyperparameter tuning, a grid search was conducted over the number of hidden layers, the class weight on positive cases, and the classification threshold.
Setting and Population. All births in the US.
Methods. Ordinary least squares regression and deep neural networks.
Main Outcome Measures. LBW (<2,500 g) and PTB (<37 weeks).
Results. The classifiers generally showed high accuracy and specificity; however, the DNN classifiers achieved substantially higher sensitivity. The highest sensitivity attained at comparable specificity was 0.71 for LBW and 0.65 for PTB.
Conclusion. These findings indicate that an ML approach could benefit PCHV programs by helping identify mothers at high risk of premature birth. In particular, DNN classifiers trained on administrative data can provide an accessible solution for public agencies and nonprofit organizations that provide PCHV services but are unlikely to possess massive clinical data or highly accurate genetic testing equipment.
Funding. Faculty Research Awards Programs (FRAP) of University at Albany
Keywords. machine learning, artificial intelligence, prediction, birth outcomes, low birth weight, preterm birth, disadvantaged groups.
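The abstract reports sensitivity at comparable specificity and mentions a grid search over classification thresholds. As an illustration only, the following minimal Python sketch shows how those two quantities can be computed for a binary classifier and how a threshold might be picked from a small grid. It is not the authors' implementation: the function names, the toy labels and scores, and the specificity floor `min_specificity` are all assumptions introduced here, and no model is trained.

```python
from typing import List, Optional, Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def best_threshold(
    y_true: Sequence[int],
    scores: Sequence[float],
    thresholds: Sequence[float],
    min_specificity: float,  # hypothetical floor, not taken from the paper
) -> Optional[Tuple[float, float, float]]:
    """Pick the threshold with the highest sensitivity among those meeting the floor."""
    best = None
    for t in thresholds:
        sens, spec = sensitivity_specificity(y_true, scores, t)
        if spec < min_specificity:
            continue  # skip thresholds that sacrifice too much specificity
        if best is None or sens > best[1]:
            best = (t, sens, spec)
    return best


# Toy data (hypothetical): 1 = adverse outcome (e.g. LBW), 0 = otherwise.
toy_labels: List[int] = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores: List[float] = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

chosen = best_threshold(toy_labels, toy_scores, [0.25, 0.5, 0.75], min_specificity=0.8)
print(chosen)
```

With rare outcomes such as LBW and PTB, accuracy and specificity can be high even when few positive cases are detected, which is why the threshold (along with the class weight on positive cases) is tuned with sensitivity in mind.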
* Corresponding author: Yongjin Choi, Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT, United Kingdom, yongjin.choi@vaccineconfidence.org
Acknowledgments
The authors gratefully acknowledge the support from the Faculty Research Awards Programs (FRAP) of University at Albany, State University of New York.