Utrecht
Veli: Unsupervised Method and Unified Benchmark for Low-Cost Air Quality Sensor Correction
Dalbah, Yahia, Worring, Marcel, Hsu, Yen-Chia
Urban air pollution is a major health crisis causing millions of premature deaths annually, underscoring the urgent need for accurate and scalable monitoring of air quality (AQ). While low-cost sensors (LCS) offer a scalable alternative to expensive reference-grade stations, their readings are affected by drift, calibration errors, and environmental interference. To address these challenges, we introduce Veli (Reference-free Variational Estimation via Latent Inference), an unsupervised Bayesian model that leverages variational inference to correct LCS readings without requiring co-location with reference stations, eliminating a major deployment barrier. Specifically, Veli constructs a disentangled representation of the LCS readings, effectively separating the true pollutant reading from the sensor noise. To build our model and address the lack of standardized benchmarks in AQ monitoring, we also introduce the Air Quality Sensor Data Repository (AQ-SDR). AQ-SDR is the largest AQ sensor benchmark to date, with readings from 23,737 LCS and reference stations across multiple regions. Veli demonstrates strong generalization across both in-distribution and out-of-distribution settings, effectively handling sensor drift and erratic sensor behavior. Code for model and dataset will be made public when this paper is published.
- North America > United States (0.46)
- Europe > Netherlands > North Holland > Amsterdam (0.05)
- Europe > Netherlands > Gelderland > Nijmegen (0.04)
Understanding Student Interaction with AI-Powered Next-Step Hints: Strategies and Challenges
Birillo, Anastasiia, Rostovskii, Aleksei, Golubev, Yaroslav, Keuning, Hieke
Automated feedback generation plays a crucial role in enhancing personalized learning experiences in computer science education. Among different types of feedback, next-step hint feedback is particularly important, as it provides students with actionable steps to progress towards solving programming tasks. This study investigates how students interact with an AI-driven next-step hint system in an in-IDE learning environment. We gathered and analyzed a dataset from 34 students solving Kotlin tasks, containing detailed hint interaction logs. We applied process mining techniques and identified 16 common interaction scenarios. Semi-structured interviews with 6 students revealed strategies for managing unhelpful hints, such as adapting partial hints or modifying code to generate variations of the same hint. These findings, combined with our publicly available dataset, offer valuable opportunities for future research and provide key insights into student behavior, helping improve hint design for enhanced learning support.
- North America > United States > Missouri > St. Louis County > St. Louis (0.05)
- Europe > Serbia > Central Serbia > Belgrade (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > New Finding (0.93)
- Personal > Interview (0.88)
Learning Communication Skills in Multi-task Multi-agent Deep Reinforcement Learning
Zhu, Changxi, Dastani, Mehdi, Wang, Shihan
In multi-agent deep reinforcement learning (MADRL), agents can communicate with one another to perform a task in a coordinated manner. When multiple tasks are involved, agents can also leverage knowledge from one task to improve learning in other tasks. In this paper, we propose Multi-task Communication Skills (MCS), a communication-based MADRL method that learns and performs multiple tasks simultaneously, with agents interacting through learnable communication protocols. MCS employs a Transformer encoder to encode task-specific observations into a shared message space, capturing shared communication skills among agents. To enhance coordination among agents, we introduce a prediction network that correlates messages with the actions of sender agents in each task. We adapt three multi-agent benchmark environments to multi-task settings, where the number of agents as well as the observation and action spaces vary across tasks. Experimental results demonstrate that MCS achieves better performance than multi-task MADRL baselines without communication, as well as single-task MADRL baselines with and without communication.
- Europe > Austria > Vienna (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > Netherlands > Utrecht (0.04)
- Leisure & Entertainment (0.68)
- Information Technology (0.46)
On Improvisation and Open-Endedness: Insights for Experiential AI
Improvisation--the art of spontaneous creation that unfolds moment-to-moment without a scripted outcome--requires practitioners to continuously sense, adapt, and create anew. It is a fundamental mode of human creativity spanning music, dance, and everyday life. The open-ended nature of improvisation produces a stream of novel, unrepeatable moments--an aspect highly valued in artistic creativity. In parallel, open-endedness (OE)--a system's capacity for unbounded novelty and endless "interestingness"--is exemplified in natural or cultural evolution and has been considered "the last grand challenge" in artificial life (ALife). The rise of generative AI now raises the question in computational creativity (CC) research: What makes a "good" improvisation for AI? Can AI learn to improvise in a genuinely open-ended way? In this work-in-progress paper, we report insights from in-depth interviews with 6 experts in improvisation across dance, music, and contact improvisation. We draw systemic connections between human improvisational arts and the design of future experiential AI agents that could improvise alone or alongside humans--or even with other AI agents--embodying qualities of improvisation drawn from practice: active listening (umwelt and awareness), being in the time (mindfulness and ephemerality), embracing the unknown (source of randomness and serendipity), non-judgmental flow (acceptance and dynamical stability), balancing structure and surprise (unpredictable criticality at edge of chaos), imaginative metaphor (synaesthesia and planning), empathy, trust, boundary, and care (mutual theory of mind), and playfulness and intrinsic motivation (maintaining interestingness).
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > China (0.04)
- Personal > Interview (0.66)
- Research Report (0.64)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Health & Medicine (1.00)
Scalable LinUCB: Low-Rank Design Matrix Updates for Recommenders with Large Action Spaces
Shustova, Evgenia, Sheshukova, Marina, Samsonov, Sergey, Frolov, Evgeny
Linear contextual bandits, especially LinUCB, are widely used in recommender systems. However, their training, inference, and memory costs grow with feature dimensionality and the size of the action space. The key bottleneck is the need to update, invert, and store a design matrix that absorbs contextual information from the interaction history. In this paper, we introduce Scalable LinUCB, an algorithm that enables fast and memory-efficient operations with the inverse regularized design matrix. We achieve this through a dynamical low-rank parametrization of its inverse Cholesky-style factors. We derive numerically stable rank-1 and batched updates that maintain the inverse without directly forming the entire matrix. To control memory growth, we employ a projector-splitting integrator for dynamical low-rank approximation, yielding average per-step update cost $O(dr)$ and memory $O(dr)$ for approximation rank $r$. Inference complexity of the suggested algorithm is $O(dr)$ per action evaluation. Experiments on recommender system datasets demonstrate the effectiveness of our algorithm.
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Asia > Russia (0.04)
- Asia > Singapore (0.04)
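For context on the Scalable LinUCB entry, the sketch below shows the textbook full-matrix LinUCB step that the abstract describes as the bottleneck: the inverse regularized design matrix is kept explicitly and refreshed with a Sherman-Morrison rank-1 update, which costs $O(d^2)$ time and memory per arm. This is not the paper's low-rank method. The class name `LinUCBArm` and the parameters `lam` and `alpha` are illustrative assumptions.

```python
import math


def matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class LinUCBArm:
    """Baseline LinUCB arm storing the full d x d inverse design matrix.

    Hypothetical illustration only; the names and defaults are assumptions.
    """

    def __init__(self, d, lam=1.0, alpha=1.0):
        # Inverse of the regularized design matrix A = lam * I.
        self.A_inv = [[(1.0 / lam) if i == j else 0.0 for j in range(d)]
                      for i in range(d)]
        self.b = [0.0] * d      # running sum of reward * context
        self.alpha = alpha      # exploration weight

    def ucb(self, x):
        """Upper confidence bound for context x."""
        theta = matvec(self.A_inv, self.b)           # ridge estimate
        width = math.sqrt(dot(x, matvec(self.A_inv, x)))
        return dot(theta, x) + self.alpha * width

    def update(self, x, reward):
        """Sherman-Morrison: (A + x x^T)^-1 = A^-1 - u u^T / (1 + x^T u),
        with u = A^-1 x (A^-1 is symmetric)."""
        u = matvec(self.A_inv, x)
        denom = 1.0 + dot(x, u)
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= u[i] * u[j] / denom
        self.b = [b_i + reward * x_i for b_i, x_i in zip(self.b, x)]
```

Every call to `update` touches all $d^2$ entries of `A_inv`, which is the cost the abstract reports reducing to $O(dr)$ with a rank-$r$ factorization.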
The Living Forecast: Evolving Day-Ahead Predictions into Intraday Reality
Bölat, Kutay, Palensky, Peter, Tindemans, Simon
Accurate intraday forecasts are essential for power system operations, complementing day-ahead forecasts that gradually lose relevance as new information becomes available. This paper introduces a Bayesian updating mechanism that converts fully probabilistic day-ahead forecasts into intraday forecasts without retraining or re-inference. The approach conditions the Gaussian mixture output of a conditional variational autoencoder-based forecaster on observed measurements, yielding an updated distribution for the remaining horizon that preserves its probabilistic structure. This enables consistent point, quantile, and ensemble forecasts while remaining computationally efficient and suitable for real-time applications. Experiments on household electricity consumption and photovoltaic generation datasets demonstrate that the proposed method improves forecast accuracy by up to 25% across likelihood-, sample-, quantile-, and point-based metrics. The largest gains occur in time steps with strong temporal correlation to observed data, and the use of pattern dictionary-based covariance structures further enhances performance. The results highlight a theoretically grounded framework for intraday forecasting in modern power systems.
- Europe > Netherlands > South Holland > Delft (0.04)
- Europe > Netherlands > Utrecht (0.04)
- Energy > Power Industry (1.00)
- Energy > Renewable > Solar (0.88)
- Information Technology > Architecture > Real Time Systems (0.89)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
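The conditioning step described in The Living Forecast abstract can be illustrated with the standard rule for conditioning a Gaussian mixture on an observation. The sketch below is not the authors' implementation. It assumes each mixture component is a bivariate Gaussian over one already-observed time step and one remaining time step; the function name `condition_mixture` and the tuple layout are hypothetical.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def condition_mixture(components, y_obs):
    """Condition a bivariate Gaussian mixture on the observed coordinate.

    components: list of (weight, (m_o, m_r), (s_oo, s_or, s_rr)), where
    'o' is the observed step and 'r' the remaining step (assumed layout).
    Returns a list of (weight, mean, variance) for the remaining step.
    """
    updated = []
    for w, (m_o, m_r), (s_oo, s_or, s_rr) in components:
        gain = s_or / s_oo                         # regression coefficient
        cond_mean = m_r + gain * (y_obs - m_o)     # shifted by the surprise
        cond_var = s_rr - gain * s_or              # reduced by correlation
        new_w = w * normal_pdf(y_obs, m_o, s_oo)   # Bayes reweighting
        updated.append((new_w, cond_mean, cond_var))
    total = sum(w for w, _, _ in updated)
    return [(w / total, m, v) for w, m, v in updated]
```

The result is again a Gaussian mixture, so point, quantile, and sample-based forecasts can all be read from it. The variance reduction `gain * s_or` grows with the covariance between the observed and remaining steps, matching the abstract's note that gains are largest where temporal correlation is strong.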
On the Design and Evaluation of Human-centered Explainable AI Systems: A Systematic Review and Taxonomy
Mangold, Aline, Zietz, Juliane, Weinhold, Susanne, Pannasch, Sebastian
As AI becomes more common in everyday living, there is an increasing demand for intelligent systems that are both performant and understandable. Explainable AI (XAI) systems aim to provide comprehensible explanations of decisions and predictions. At present, however, evaluation processes are rather technical and not sufficiently focused on the needs of human users. Consequently, evaluation studies involving human users can serve as a valuable guide for conducting user studies. This paper presents a comprehensive review of 65 user studies evaluating XAI systems across different domains and application contexts. As a guideline for XAI developers, we provide a holistic overview of the properties of XAI systems and evaluation metrics focused on human users (human-centered). We propose objectives for the human-centered design (design goals) of XAI systems. To incorporate users' specific characteristics, design goals are adapted to users with different levels of AI expertise (AI novices and data experts). In this regard, we provide an extension to existing XAI evaluation and design frameworks. The first part of our results includes the analysis of XAI system characteristics. An important finding is the distinction between the core system and the XAI explanation, which together form the whole system. Further results include the distinction of evaluation metrics into affection towards the system, cognition, usability, interpretability, and explanation metrics. Furthermore, the users, along with their specific characteristics and behavior, can be assessed. For AI novices, the relevant extended design goals include responsible use, acceptance, and usability. For data experts, the focus is performance-oriented and includes human-AI collaboration and system and user task performance.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > Germany > Hamburg (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (1.00)
- Information Technology (1.00)
- Health & Medicine (1.00)
- Leisure & Entertainment > Games > Computer Games (0.92)
- Education > Educational Setting (0.67)
What Do Temporal Graph Learning Models Learn?
Hayes, Abigail J., Schumacher, Tobias, Strohmaier, Markus
Learning on temporal graphs has become a central topic in graph representation learning, with numerous benchmarks indicating the strong performance of state-of-the-art models. However, recent work has raised concerns about the reliability of benchmark results, noting issues with commonly used evaluation protocols and the surprising competitiveness of simple heuristics. This contrast raises the question of which properties of the underlying graphs temporal graph learning models actually use to form their predictions. We address this by systematically evaluating seven models on their ability to capture eight fundamental attributes related to the link structure of temporal graphs. These include structural characteristics such as density, temporal patterns such as recency, and edge formation mechanisms such as homophily. Using both synthetic and real-world datasets, we analyze how well models learn these attributes. Our findings reveal a mixed picture: models capture some attributes well but fail to reproduce others. With this, we expose important limitations. Overall, we believe that our results provide practical insights for the application of temporal graph learning models, and motivate more interpretability-driven evaluations in temporal graph learning research.
- North America > United States > California (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > Singapore (0.04)
From Discord to Harmony: Decomposed Consonance-based Training for Improved Audio Chord Estimation
Poltronieri, Andrea, Serra, Xavier, Rocamora, Martín
Audio Chord Estimation (ACE) holds a pivotal role in music information research, having garnered attention for over two decades due to its relevance for music transcription and analysis. Despite notable advancements, challenges persist in the task, particularly concerning unique characteristics of harmonic content, which have resulted in existing systems' performances reaching a glass ceiling. These challenges include annotator subjectivity, where varying interpretations among annotators lead to inconsistencies, and class imbalance within chord datasets, where certain chord classes are over-represented compared to others, posing difficulties in model training and evaluation. As a first contribution, this paper presents an evaluation of inter-annotator agreement in chord annotations, using metrics that extend beyond traditional binary measures. In addition, we propose a consonance-informed distance metric that reflects the perceptual similarity between harmonic annotations. Our analysis suggests that consonance-based distance metrics more effectively capture musically meaningful agreement between annotations. Expanding on these findings, we introduce a novel ACE conformer-based model that integrates consonance concepts into the model through consonance-based label smoothing. The proposed model also addresses class imbalance by separately estimating root, bass, and all note activations, enabling the reconstruction of chord labels from decomposed outputs.
- Europe > Netherlands > South Holland > Delft (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > Puerto Rico > Peñuelas > Peñuelas (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
CaTE Data Curation for Trustworthy AI
Clemens-Sewall, Mary Versa, Cervantes, Christopher, Rafkin, Emma, Otte, J. Neil, Magelinski, Tom, Lewis, Libby, Liu, Michelle, Udwin, Dana, Kirkman-Bey, Monique
This report provides practical guidance to teams designing or developing AI-enabled systems on how to promote trustworthiness during the data curation phase of development. We first define data, the data curation phase, and trustworthiness. We then describe a series of steps that the development team, especially data scientists, can take to build a trustworthy AI-enabled system. We enumerate the sequence of core steps and trace parallel paths where alternatives exist. The descriptions of these steps include strengths, weaknesses, preconditions, outcomes, and relevant open-source software tool implementations. In total, this report is a synthesis of data curation tools and approaches from relevant academic literature, and our goal is to equip readers with a diverse yet coherent set of practices for improving AI trustworthiness.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Washington > King County > Seattle (0.13)
- North America > United States > California > San Francisco County > San Francisco (0.13)
- Overview (1.00)
- Research Report > Experimental Study (0.46)
- Instructional Material > Course Syllabus & Notes (0.45)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
- Transportation (0.92)