2025 Research Days
Binghamton Research Days Student Presentations


From Data to Delivery: Predicting Pregnancy Risk with a Boosted Decision Tree

Authors: McKenzie Skrastins, Leemor Waldman

Field of Study: Science, Technology, Engineering, and/or Math

Program Affiliation: CSTEP (Collegiate Science and Technology Entry Program), Louis Stokes Alliances for Minority Participation (LSAMP)

Faculty Mentors: Vladislav Kargin

Timeslot: Midday

Abstract: Maternal health is a critical global priority, with an estimated 213 million pregnancies annually. High-risk pregnancies contribute significantly to maternal and neonatal mortality, particularly in low-income regions. This study aimed to build a statistical learning model that classifies the risk level of pregnancies in Bangladesh. The dataset contained 1,014 observations of key physiological features. Logistic regression achieved 65% accuracy, indicating that the features are related to risk level. Due to the high-bias, low-variance nature of the dataset, a decision tree, specifically a boosted decision tree, was used to make the final predictions. A baseline decision tree with optimized hyperparameters reached an accuracy of 82.3%, and a boosted decision tree with tuned hyperparameters yielded a 10-fold cross-validated accuracy of 85.81%. These results highlight the usefulness of statistical learning in identifying high-risk pregnancies and emphasize the need for enhanced prenatal care in socioeconomically disadvantaged regions.
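
For readers unfamiliar with the evaluation described in the abstract, the following is a minimal sketch of how a boosted tree classifier can be scored with 10-fold cross-validation. It is not the authors' code. Everything in it is an assumption made for illustration: the boosting method (AdaBoost over one-split decision stumps), the two-class simplification of risk level, the two hypothetical feature names, and the synthetic data, which has no connection to the 1,014 observations used in the study.

```python
import math
import random


def fit_stump(X, y, w):
    """Return the one-feature threshold split with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[j] > t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best  # (weighted error, feature index, threshold, sign)


def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] > t else -sign


def fit_adaboost(X, y, n_rounds=5):
    """AdaBoost: reweight misclassified rows, then fit the next stump."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # vote weight of this stump
        ensemble.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


def kfold_accuracy(X, y, k=10, seed=0):
    """Mean held-out accuracy over k folds of shuffled row indices."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = fit_adaboost([X[i] for i in train], [y[i] for i in train])
        accs.append(sum(predict(model, X[i]) == y[i] for i in fold) / len(fold))
    return sum(accs) / k


# Synthetic stand-in data (NOT the study's dataset). The two columns are
# hypothetical features: systolic blood pressure and blood sugar.
rng = random.Random(42)
X = [[rng.uniform(90, 160), rng.uniform(6, 19)] for _ in range(120)]
# Made-up labeling rule: +1 = "high risk", -1 = "low risk".
y = [1 if (bp > 130 or bs > 12) else -1 for bp, bs in X]

acc = kfold_accuracy(X, y, k=10)
print(f"10-fold cross-validated accuracy: {acc:.3f}")
```

Each fold is held out once while the model is trained on the other nine, so every accuracy figure comes from rows the model did not see during fitting. The reported value is the mean over the ten folds, which is the kind of number the abstract quotes for the boosted tree.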