Goto

Collaborating Authors

 Sigurðardóttir, Sigríður


World of ScoreCraft: Novel Multi Scorer Experiment on the Impact of a Decision Support System in Sleep Staging

arXiv.org Artificial Intelligence

Manual scoring of polysomnography (PSG) is a time intensive task, prone to inter scorer variability that can impact diagnostic reliability. This study investigates the integration of decision support systems (DSS) into PSG scoring workflows, focusing on their effects on accuracy, scoring time, and potential biases toward recommendations from artificial intelligence (AI) compared to human generated recommendations. Using a novel online scoring platform, we conducted a repeated measures study with sleep technologists, who scored traditional and self applied PSGs. Participants were occasionally presented with recommendations labeled as either human or AI generated. We found that traditional PSGs tended to be scored slightly more accurately than self applied PSGs, but this difference was not statistically significant. Correct recommendations significantly improved scoring accuracy for both PSG types, while incorrect recommendations reduced accuracy. No significant bias was observed toward or against AI generated recommendations compared to human generated recommendations. These findings highlight the potential of AI to enhance PSG scoring reliability. However, ensuring the accuracy of AI outputs is critical to maximizing its benefits. Future research should explore the long term impacts of DSS on scoring workflows and strategies for integrating AI in clinical practice.


aSAGA: Automatic Sleep Analysis with Gray Areas

arXiv.org Artificial Intelligence

State-of-the-art automatic sleep staging methods have already demonstrated comparable reliability and superior time efficiency to manual sleep staging. However, fully automatic black-box solutions are difficult to adapt into clinical workflow and the interaction between explainable automatic methods and the work of sleep technologists remains underexplored and inadequately conceptualized. Thus, we propose a human-in-the-loop concept for sleep analysis, presenting an automatic sleep staging model (aSAGA), that performs effectively with both clinical polysomnographic recordings and home sleep studies. To validate the model, extensive testing was conducted, employing a preclinical validation approach with three retrospective datasets; open-access, clinical, and research-driven. Furthermore, we validate the utilization of uncertainty mapping to identify ambiguous regions, conceptualized as gray areas, in automatic sleep analysis that warrants manual re-evaluation. The results demonstrate that the automatic sleep analysis achieved a comparable level of agreement with manual analysis across different sleep recording types. Moreover, validation of the gray area concept revealed its potential to enhance sleep staging accuracy and identify areas in the recordings where sleep technologists struggle to reach a consensus. In conclusion, this study introduces and validates a concept from explainable artificial intelligence into sleep medicine and provides the basis for integrating human-in-the-loop automatic sleep staging into clinical workflows, aiming to reduce black-box criticism and the burden associated with manual sleep staging.