Finding Pre-Injury Patterns in Triathletes from Lifestyle, Recovery and Load Dynamics Features

Leonardo Rossi, Bruno Rodrigues

arXiv.org Artificial Intelligence 

Embedded Sensing Group (ESG), Institute of Computer Science in Vorarlberg (ICV), University of St. Gallen (HSG), Switzerland
E-mail: leonardo.rossi@student.unisg.ch

Abstract: Triathlon training, which involves high-volume swimming, cycling, and running, places athletes at substantial risk of overuse injuries due to repetitive physiological stress. Current injury prediction approaches rely primarily on training load metrics and often neglect critical factors such as sleep quality, stress, and individual lifestyle patterns, which significantly influence recovery and injury susceptibility. We introduce a novel synthetic data generation framework tailored specifically to triathlon. The framework generates physiologically plausible athlete profiles, simulates individualized training programs that incorporate periodization and load-management principles, and integrates daily-life factors such as sleep quality, stress levels, and recovery states. We evaluated machine learning models (LASSO, Random Forest, and XGBoost), which achieved high predictive performance (AUC up to 0.86) and identified sleep disturbances, heart rate variability, and stress as critical early indicators of injury risk. This wearable-driven approach not only improves injury prediction accuracy but also offers a practical way to overcome real-world data limitations, providing a pathway toward holistic, context-aware athlete monitoring.

Triathlon is a demanding multi-sport discipline that combines swimming, cycling, and running.