Lucerne
Jesus returns as an AI INFLUENCER who you can video call in real-time - but, there's a catch
For Christians, Easter is a time to remember how Jesus was killed on the cross before returning three days later. But now, around 2,000 years later, the messiah has returned once more - this time as an AI influencer. The AI Jesus chatbot allows you to video call the son of God in real-time from the comfort of your computer. But faithful followers should be warned that there is a fairly major catch. This AI chatbot has been built not only to deliver words of wisdom and comfort, but also to advertise products. Designed as a 'satire on spiritual consumerism', the bizarre website's creators say that the AI Jesus will always make sure to suggest a 'strangely fitting product'.
Subgroup Performance Analysis in Hidden Stratifications
Bissoto, Alceu, Hoang, Trung-Dung, Flühmann, Tim, Sun, Susu, Baumgartner, Christian F., Koch, Lisa M.
Machine learning (ML) models may suffer from significant performance disparities between patient groups. Identifying such disparities by monitoring performance at a granular level is crucial for safely deploying ML to each patient. Traditional subgroup analysis based on metadata can expose performance disparities only if the available metadata (e.g., patient sex) sufficiently reflects the main reasons for performance variability, which is not common. Subgroup discovery techniques that identify cohesive subgroups based on learned feature representations appear as a potential solution: They could expose hidden stratifications and provide more granular subgroup performance reports. However, subgroup discovery is challenging to evaluate even as a standalone task, as ground truth stratification labels do not exist in real data. Subgroup discovery has thus neither been applied nor evaluated for the application of subgroup performance monitoring. Here, we apply subgroup discovery for performance monitoring in chest x-ray and skin lesion classification. We propose novel evaluation strategies and show that a simplified subgroup discovery method without access to classification labels or metadata can expose larger performance disparities than traditional metadata-based subgroup analysis. We provide the first compelling evidence that subgroup discovery can serve as an important tool for comprehensive performance validation and monitoring of trustworthy AI in medicine.
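The granular performance report described above can be illustrated with a minimal sketch. It assumes that subgroup assignments have already been produced by some clustering of learned feature representations; the abstract does not name the clustering method, and the function names `subgroup_accuracy` and `disparity` are hypothetical, not taken from the paper.

```python
from collections import defaultdict


def subgroup_accuracy(cluster_ids, y_true, y_pred):
    """Return {cluster_id: (accuracy, size)} for each discovered subgroup."""
    hits = defaultdict(int)
    sizes = defaultdict(int)
    for c, t, p in zip(cluster_ids, y_true, y_pred):
        sizes[c] += 1
        hits[c] += int(t == p)
    return {c: (hits[c] / sizes[c], sizes[c]) for c in sizes}


def disparity(report):
    """Gap between the best and worst subgroup accuracy."""
    accuracies = [acc for acc, _ in report.values()]
    return max(accuracies) - min(accuracies)


# Toy data: cluster ids are assumed to come from an unsupervised method
# that never saw the labels or any metadata.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
y_true = [1, 0, 1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

report = subgroup_accuracy(cluster_ids, y_true, y_pred)
gap = disparity(report)
```

In this toy case the overall accuracy of 0.75 hides the fact that one subgroup is classified perfectly while the other is at chance level.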
Knowledge-Augmented Explainable and Interpretable Learning for Anomaly Detection and Diagnosis
Atzmueller, Martin, Bohne, Tim, Windler, Patricia
Knowledge-augmented learning enables the combination of knowledge-based and data-driven approaches. For anomaly detection and diagnosis, understandability is typically an important factor, especially in high-risk areas. Therefore, explainability and interpretability are also major criteria in such contexts. This chapter focuses on knowledge-augmented explainable and interpretable learning to enhance understandability, transparency and ultimately computational sensemaking. We exemplify different approaches and methods in the domains of anomaly detection and diagnosis - from comparatively simple interpretable methods towards more advanced neuro-symbolic approaches.
Towards Foundation Models for Critical Care Time Series
Burger, Manuel, Sergeev, Fedor, Londschien, Malte, Chopard, Daphné, Yèche, Hugo, Gerdes, Eike, Leshetkina, Polina, Morgenroth, Alexander, Babür, Zeynep, Bogojeska, Jasmina, Faltys, Martin, Kuznetsova, Rita, Rätsch, Gunnar
Notable progress has been made in generalist medical large language models across various healthcare areas. However, large-scale modeling of in-hospital time series data - such as vital signs, lab results, and treatments in critical care - remains underexplored. Existing datasets are relatively small, but combining them can enhance patient diversity and improve model robustness. To effectively utilize these combined datasets for large-scale modeling, it is essential to address the distribution shifts caused by varying treatment policies, necessitating the harmonization of treatment variables across the different datasets. This work aims to establish a foundation for training large-scale multi-variate time series models on critical care data and to provide a benchmark for machine learning models in transfer learning across hospitals to study and address distribution shift challenges. We introduce a harmonized dataset for sequence modeling and transfer learning research, representing the first large-scale collection to include core treatment variables. Future plans involve expanding this dataset to support further advancements in transfer learning and the development of scalable, generalizable models for critical healthcare applications.
Church in Switzerland is using an AI-powered Jesus hologram to take confession
Some modern technologies may seem miraculous, but never has that been quite so literal. Thanks to technological advances, worshipers at a church in Switzerland can now speak directly to Jesus - or at least an AI version of him. As part of an art project called 'Deus in Machina' (God in a Machine), St Peter's Church in Lucerne has installed an AI-powered Jesus hologram to take confessions. Worshipers simply voice their concerns and questions to get a response from the digitally-rendered face of Jesus Christ. At least two-thirds of people who spoke to AI Jesus came out of the confessional reporting having had a 'spiritual' experience.
Navigating Data Scarcity using Foundation Models: A Benchmark of Few-Shot and Zero-Shot Learning Approaches in Medical Imaging
Woerner, Stefano, Baumgartner, Christian F.
Data scarcity is a major limiting factor for applying modern machine learning techniques to clinical tasks. Although sufficient data exists for some well-studied medical tasks, there remains a long tail of clinically relevant tasks with poor data availability. Recently, numerous foundation models have demonstrated high suitability for few-shot learning (FSL) and zero-shot learning (ZSL), potentially making them more accessible to practitioners. However, it remains unclear which foundation model performs best on FSL medical image analysis tasks and what the optimal methods are for learning from limited data. We conducted a comprehensive benchmark study of ZSL and FSL using 16 pretrained foundation models on 19 diverse medical imaging datasets. Our results indicate that BiomedCLIP, a model pretrained exclusively on medical data, performs best on average for very small training set sizes, while very large CLIP models pretrained on LAION-2B perform best with slightly more training samples. However, simply fine-tuning a ResNet-18 pretrained on ImageNet performs similarly with more than five training examples per class. Our findings also highlight the need for further research on foundation models specifically tailored for medical applications and the collection of more datasets to train these models.
Subgroup-Specific Risk-Controlled Dose Estimation in Radiotherapy
Fischer, Paul, Willms, Hannah, Schneider, Moritz, Thorwarth, Daniela, Muehlebach, Michael, Baumgartner, Christian F.
Cancer remains a leading cause of death, highlighting the importance of effective radiotherapy (RT). Magnetic resonance-guided linear accelerators (MR-Linacs) enable imaging during RT, allowing for inter-fraction, and perhaps even intra-fraction, adjustments of treatment plans. However, achieving this requires fast and accurate dose calculations. While Monte Carlo simulations offer accuracy, they are computationally intensive. Deep learning frameworks show promise, yet lack uncertainty quantification crucial for high-risk applications like RT. Risk-controlling prediction sets (RCPS) offer model-agnostic uncertainty quantification with mathematical guarantees. However, we show that naive application of RCPS may lead to only certain subgroups such as the image background being risk-controlled. In this work, we extend RCPS to provide prediction intervals with coverage guarantees for multiple subgroups with unknown subgroup membership at test time. We evaluate our algorithm on real clinical planning volumes from five different anatomical regions and show that our novel subgroup RCPS (SG-RCPS) algorithm leads to prediction intervals that jointly control the risk for multiple subgroups. In particular, our method controls the risk of the crucial voxels along the radiation beam significantly better than conventional RCPS.
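To make the limitation of naive RCPS concrete, the following is a minimal sketch of plain RCPS calibration for prediction intervals. It is not the authors' SG-RCPS algorithm. It assumes intervals of the form [pred - lam * scale, pred + lam * scale], a 0/1 miscoverage loss, and a Hoeffding upper confidence bound (one of several valid choices). All function names and the toy data, including the "beam" and "background" labels, are hypothetical.

```python
import math


def miscoverage_losses(preds, scales, targets, lam):
    """1.0 where [pred - lam*scale, pred + lam*scale] misses the target, else 0.0."""
    return [0.0 if abs(t - p) <= lam * s else 1.0
            for p, s, t in zip(preds, scales, targets)]


def hoeffding_upper_bound(losses, delta):
    """Upper confidence bound on the mean of losses bounded in [0, 1]."""
    n = len(losses)
    return sum(losses) / n + math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def calibrate_lambda(preds, scales, targets, alpha, delta, grid):
    """Scan lambda from large to small; keep the smallest value reached
    before the risk bound first fails to stay below alpha."""
    chosen = None
    for lam in sorted(grid, reverse=True):
        losses = miscoverage_losses(preds, scales, targets, lam)
        if hoeffding_upper_bound(losses, delta) < alpha:
            chosen = lam
        else:
            break
    return chosen


def subgroup_risk(preds, scales, targets, groups, lam):
    """Empirical miscoverage rate per subgroup at a fixed lambda."""
    per_group = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        losses = miscoverage_losses([preds[i] for i in idx],
                                    [scales[i] for i in idx],
                                    [targets[i] for i in idx], lam)
        per_group[g] = sum(losses) / len(losses)
    return per_group


# Toy calibration set: large errors are concentrated in a small "beam" group.
targets = [(i % 100) / 100 for i in range(1000)]
preds = [0.0] * 1000
scales = [1.0] * 1000
groups = ["beam" if t >= 0.8 else "background" for t in targets]
grid = [i / 20 for i in range(21)]

lam_hat = calibrate_lambda(preds, scales, targets, alpha=0.1, delta=0.1, grid=grid)
risks = subgroup_risk(preds, scales, targets, groups, lam_hat)
```

On this toy data the calibrated value keeps the overall miscoverage below alpha, yet every miss falls in the "beam" group, whose empirical risk of 0.2 exceeds alpha while the "background" risk is 0.0. This is the kind of imbalance a subgroup-aware calibration would need to address.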
Attri-Net: A Globally and Locally Inherently Interpretable Model for Multi-Label Classification Using Class-Specific Counterfactuals
Sun, Susu, Woerner, Stefano, Maier, Andreas, Koch, Lisa M., Baumgartner, Christian F.
Interpretability is crucial for machine learning algorithms in high-stakes medical applications. However, high-performing neural networks typically cannot explain their predictions. Post-hoc explanation methods provide a way to understand neural networks but have been shown to suffer from conceptual problems. Moreover, current research largely focuses on providing local explanations for individual samples rather than global explanations for the model itself. In this paper, we propose Attri-Net, an inherently interpretable model for multi-label classification that provides local and global explanations. Attri-Net first counterfactually generates class-specific attribution maps to highlight the disease evidence, then performs classification with logistic regression classifiers based solely on the attribution maps. Local explanations for each prediction can be obtained by interpreting the attribution maps weighted by the classifiers' weights. A global explanation of the whole model can be obtained by jointly considering learned average representations of the attribution maps for each class (called the class centers) and the weights of the linear classifiers. To ensure the model is "right for the right reason", we further introduce a mechanism to guide the model's explanations to align with human knowledge. Our comprehensive evaluations show that Attri-Net can generate high-quality explanations consistent with clinical knowledge while not sacrificing classification performance.
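The classification step described above can be sketched for a single class as follows. This is an illustration under simplifying assumptions, not the Attri-Net implementation: the attribution map is taken as an already flattened list of values, any downsampling or other preprocessing the real model may apply is ignored, and the function name `classify_from_attribution` is hypothetical.

```python
import math


def classify_from_attribution(attribution_map, weights, bias):
    """Logistic regression on a flattened attribution map for one class.

    Returns the predicted probability and the per-pixel contributions
    (attribution values weighted by the classifier weights).
    """
    contributions = [w * a for w, a in zip(weights, attribution_map)]
    logit = sum(contributions) + bias
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, contributions


# Toy values standing in for a generated class-specific attribution map.
attribution_map = [0.0, 2.0, -1.0, 0.5]
weights = [1.0, 0.5, 1.0, -2.0]

prob, contributions = classify_from_attribution(attribution_map, weights, bias=0.0)
```

Because the classifier is linear, the contributions sum (together with the bias) to the logit, so the weighted map accounts for the entire prediction rather than approximating it after the fact.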
AI-based Classification of Customer Support Tickets: State of the Art and Implementation with AutoML
One of the primary priorities of companies today is to improve the Customer Experience (CX) to increase customer satisfaction and reduce churn. However, "just 2 percent of organizations reached the top stage of CX maturity [and] most organizations are in early stages of CX maturity" (Dorsey et al., 2022). According to a recent study by Qualtrics (2022), 47 percent of customers ranked support as the second most important area of improvement in CX. One major factor of customer satisfaction identified in recent research (e.g., Service Excellence Research Group, 2021) is the speed at which customer support answers customer inquiries. Demand for customer support is rising and often exceeds the supply of available support agents. Missing knowledge and multiple re-routings between support agents, in particular, are major causes of delays in resolution time. Further research suggests that due to information overload, the quality of decisions decreases with the number of decisions (Hemp, 2009; Viegas et al., 2015). In the most recent studies, lack of time and resources is mentioned as the main issue in customer support, harming performance and, ultimately, the customer experience (HubSpot, 2022; Serrano et al., 2021).
A comprehensive and easy-to-use multi-domain multi-task medical imaging meta-dataset (MedIMeta)
Woerner, Stefano, Jaques, Arthur, Baumgartner, Christian F.
While the field of medical image analysis has undergone a transformative shift with the integration of machine learning techniques, the main challenge of these techniques is often the scarcity of large, diverse, and well-annotated datasets. Medical images vary in format, size, and other parameters and therefore require extensive preprocessing and standardization before they can be used in machine learning. Addressing these challenges, we introduce the Medical Imaging Meta-Dataset (MedIMeta), a novel multi-domain, multi-task meta-dataset. MedIMeta contains 19 medical imaging datasets spanning 10 different domains and encompassing 54 distinct medical tasks, all of which are standardized to the same format and readily usable in PyTorch or other ML frameworks. We perform a technical validation of MedIMeta, demonstrating its utility through fully supervised and cross-domain few-shot learning baselines.