🤖 AI Summary
Traditional fetal birth weight prediction models suffer from low feature dimensionality, prevalent missing data, and inadequate modeling of maternal–fetal interactions, resulting in poor generalizability across clinical centers. To address these limitations, we propose a novel preprocessing paradigm integrating Multiple Imputation by Chained Equations (MICE) with tree-based feature selection (XGBoost), followed by an ensemble predictive model combining Bayesian Additive Regression Trees (BART) and gradient boosting regression. This is the first approach to enable interpretable extraction of maternal–fetal physiological determinants from high-dimensional clinical data. It achieves a 23% reduction in mean absolute error (MAE) and demonstrates strong robustness across multi-center real-world datasets. The framework supports precise risk stratification for preterm birth and low birth weight, facilitating individualized perinatal intervention decisions—thereby bridging advanced predictive modeling with clinical perinatal practice.
📝 Abstract
Birth weight serves as a fundamental indicator of neonatal health, closely linked to both early medical interventions and long-term developmental risks. Traditional predictive models, often constrained by limited feature selection and incomplete datasets, struggle to achieve overlooking complex maternal and fetal interactions in diverse clinical settings. This research explores machine learning to address these limitations, utilizing a structured methodology that integrates advanced imputation strategies, supervised feature selection techniques, and predictive modeling. Given the constraints of the dataset, the research strengthens the role of data preprocessing in improving the model performance. Among the various methodologies explored, tree-based feature selection methods demonstrated superior capability in identifying the most relevant predictors, while ensemble-based regression models proved highly effective in capturing non-linear relationships and complex maternal-fetal interactions within the data. Beyond model performance, the study highlights the clinical significance of key physiological determinants, offering insights into maternal and fetal health factors that influence birth weight, offering insights that extend over statistical modeling. By bridging computational intelligence with perinatal research, this work underscores the transformative role of machine learning in enhancing predictive accuracy, refining risk assessment and informing data-driven decision-making in maternal and neonatal care. Keywords: Birth weight prediction, maternal-fetal health, MICE, BART, Gradient Boosting, neonatal outcomes, Clinipredictive.