Imputation Matters: A Deeper Look into an Overlooked Step in Longitudinal Health and Behavior Sensing Research
Choube, Akshat, Majethia, Rahul, Bhattacharya, Sohini, Swain, Vedant Das, Li, Jiachen, Mishra, Varun
arXiv.org Artificial Intelligence
Longitudinal passive sensing studies for health and behavior outcomes often have missing and incomplete data. Handling missing data effectively is thus a critical data processing and modeling step. Our formative interviews with researchers working in longitudinal health and behavior passive sensing revealed a recurring theme: most researchers consider imputation a low-priority step in their analysis and inference pipeline, opting for simple, off-the-shelf imputation strategies without comprehensively evaluating their impact on study outcomes. Through this paper, we call attention to the importance of imputation. Using publicly available passive sensing datasets for depression, we show that prioritizing imputation can significantly impact study outcomes: our proposed imputation strategies yield up to a 31% improvement in AUROC for predicting depression over the original imputation strategy. We conclude by discussing the challenges and opportunities of effective imputation in longitudinal sensing studies.
Dec-8-2024
- Country:
- Asia
- India (0.04)
- Middle East > Iraq
- Muthanna Governorate (0.04)
- Mongolia (0.04)
- Nepal (0.04)
- Europe
- North America > United States
- Florida > Hillsborough County > University (0.05)
- Oceania > New Zealand (0.04)
- South America > Paraguay (0.04)
- Genre:
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (1.00)
- Industry:
- Education (1.00)
- Health & Medicine
- Consumer Health (1.00)
- Pharmaceuticals & Biotechnology (0.92)
- Therapeutic Area
- Neurology (1.00)
- Psychiatry/Psychology > Mental Health (0.46)
- Information Technology (0.66)
- Technology:
- Information Technology
- Artificial Intelligence > Machine Learning
- Neural Networks > Deep Learning (0.67)
- Statistical Learning (0.67)
- Communications > Mobile (0.68)
- Data Science > Data Quality (0.90)