🤖 AI Summary
Existing triathlon injury prediction models over-rely on training load metrics while neglecting critical recovery factors such as sleep quality, psychological stress, and lifestyle behaviors. To address the scarcity of labeled real-world data, this study proposes a triathlon-specific synthetic data generation framework that integrates personalized training periodization with dynamic, multi-source physiological and psychological features, including heart rate variability (HRV), sleep efficiency, and subjective stress ratings. Using this synthesized dataset, we develop and evaluate context-aware injury risk prediction models based on LASSO regression, Random Forest, and XGBoost. On these simulated wearable-style features, the models reach an AUC of up to 0.86 and identify sleep disturbances, reduced HRV, and elevated psychological stress as early indicators of injury risk. This work offers a pathway toward injury risk modeling that holistically integrates lifestyle, recovery, and training load dynamics.
📝 Abstract
Triathlon training, which involves high-volume swimming, cycling, and running, places athletes at substantial risk for overuse injuries due to repetitive physiological stress. Current injury prediction approaches primarily rely on training load metrics, often neglecting critical factors such as sleep quality, stress, and individual lifestyle patterns that significantly influence recovery and injury susceptibility.
We introduce a novel synthetic data generation framework tailored explicitly for triathlon. This framework generates physiologically plausible athlete profiles, simulates individualized training programs that incorporate periodization and load-management principles, and integrates daily-life factors such as sleep quality, stress levels, and recovery states. We evaluated machine learning models (LASSO, Random Forest, and XGBoost), which showed high predictive performance (AUC up to 0.86) and identified sleep disturbances, heart rate variability, and stress as critical early indicators of injury risk. This wearable-driven approach not only enhances injury prediction accuracy but also provides a practical way to overcome real-world data limitations, offering a pathway toward holistic, context-aware athlete monitoring.
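To make the evaluation idea concrete, the following is a minimal sketch, not the authors' framework. It assumes a toy generator in which daily injury probability depends on standardized training load, sleep efficiency, HRV, and stress, with coefficients chosen purely for illustration. It then compares a load-only risk score against a combined load-plus-recovery score using a rank-based AUC. The function names (`simulate_athlete_days`, `auc`, `compare_scores`) and all numeric values are hypothetical, and simple hand-built scores stand in for the LASSO, Random Forest, and XGBoost models named in the abstract.

```python
import math
import random


def simulate_athlete_days(n_days=2000, seed=0):
    """Generate hypothetical daily records as ((load, sleep, hrv, stress), injured)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_days):
        load = rng.gauss(0.0, 1.0)    # standardized training load
        sleep = rng.gauss(0.0, 1.0)   # standardized sleep efficiency
        hrv = rng.gauss(0.0, 1.0)     # standardized heart rate variability
        stress = rng.gauss(0.0, 1.0)  # standardized subjective stress
        # Assumed risk model: these coefficients are illustrative only.
        logit = -3.0 + 0.5 * load - 0.8 * sleep - 0.7 * hrv + 0.6 * stress
        p_injury = 1.0 / (1.0 + math.exp(-logit))
        injured = 1 if rng.random() < p_injury else 0
        records.append(((load, sleep, hrv, stress), injured))
    return records


def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney U); tied scores share their average rank."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def compare_scores(records):
    """Return (AUC of load-only score, AUC of load-plus-recovery score)."""
    labels = [y for _, y in records]
    load_only = [x[0] for x, _ in records]
    # Unit-weight combination: higher load and stress raise the score,
    # better sleep and higher HRV lower it.
    combined = [x[0] - x[1] - x[2] + x[3] for x, _ in records]
    return auc(load_only, labels), auc(combined, labels)


if __name__ == "__main__":
    auc_load, auc_combined = compare_scores(simulate_athlete_days())
    print(f"load-only AUC: {auc_load:.2f}")
    print(f"load + recovery AUC: {auc_combined:.2f}")
```

Because the toy generator builds recovery factors into the injury probability, the combined score is expected to separate injured from non-injured days better than load alone. That outcome reflects the assumptions of the sketch and is not evidence for the paper's findings.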