

Language Drift in Multilingual Retrieval-Augmented Generation: Characterization and Decoding-Time Mitigation

Li, Bo, Xu, Zhenghua, Xie, Rui

arXiv.org Artificial Intelligence

Multilingual Retrieval-Augmented Generation (RAG) enables large language models (LLMs) to perform knowledge-intensive tasks in multilingual settings by leveraging retrieved documents as external evidence. However, when the retrieved evidence differs in language from the user query and in-context exemplars, the model often exhibits language drift by generating responses in an unintended language. This phenomenon is especially pronounced during reasoning-intensive decoding, such as Chain-of-Thought (CoT) generation, where intermediate steps introduce further language instability. In this paper, we systematically study output language drift in multilingual RAG across multiple datasets, languages, and LLM backbones. Our controlled experiments reveal that the drift results not from comprehension failure but from decoder-level collapse, where dominant token distributions and high-frequency English patterns override the intended generation language. We further observe that English serves as a semantic attractor under cross-lingual conditions, emerging as both the strongest interference source and the most frequent fallback language. To mitigate this, we propose Soft Constrained Decoding (SCD), a lightweight, training-free decoding strategy that gently steers generation toward the target language by penalizing non-target-language tokens. SCD is model-agnostic and can be applied to any generation algorithm without modifying the architecture or requiring additional data. Experiments across three multilingual datasets and multiple typologically diverse languages show that SCD consistently improves language alignment and task performance, providing an effective and generalizable solution for multilingual RAG.
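The soft-constraint idea described above can be sketched in a few lines: rather than hard-masking non-target-language tokens, a fixed penalty is subtracted from their logits so the target language is preferred but not forced. The vocabulary partition, penalty value, and greedy selection below are illustrative assumptions, not the paper's exact formulation.

```python
def soft_constrained_decoding_step(logits, target_lang_ids, penalty=2.0):
    """One decoding step with a soft language constraint (sketch).

    logits          -- list of raw scores, one per vocabulary token
    target_lang_ids -- set of token ids belonging to the target language
    penalty         -- amount subtracted from every non-target token's logit
    """
    adjusted = [
        score if i in target_lang_ids else score - penalty
        for i, score in enumerate(logits)
    ]
    # Greedy pick for illustration; any sampling scheme works on `adjusted`.
    return max(range(len(adjusted)), key=adjusted.__getitem__)

# Toy vocabulary: ids 0-2 are "target language", ids 3-5 are English fallback.
logits = [1.0, 0.5, 0.2, 2.0, 1.8, 0.1]
print(soft_constrained_decoding_step(logits, {0, 1, 2}))  # 0 (drift suppressed)
print(soft_constrained_decoding_step(logits, {0, 1, 2}, penalty=0.0))  # 3
```

Because the penalty is soft, a sufficiently confident non-target token (e.g. a named entity or code identifier) can still win, which is the point of "gently steering" rather than hard-constraining.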


A Similarity Measure for Comparing Conversational Dynamics

Jung, Sang Min, Zhang, Kaixiang, Danescu-Niculescu-Mizil, Cristian

arXiv.org Artificial Intelligence

The quality of a conversation goes beyond the individual quality of each reply, and instead emerges from how these combine into interactional dynamics that give the conversation its distinctive overall "shape". However, there is no robust automated method for comparing conversations in terms of their overall dynamics. Such methods could enhance the analysis of conversational data and help evaluate conversational agents more holistically. In this work, we introduce a similarity measure for comparing conversations with respect to their dynamics. We design a validation procedure for testing the robustness of the metric in capturing differences in conversation dynamics and for assessing its sensitivity to the topic of the conversations. To illustrate the measure's utility, we use it to analyze conversational dynamics in a large online community, bringing new insights into the role of situational power in conversations.


On the Energy Distribution of the Galactic Center Excess' Sources

List, Florian, Park, Yujin, Rodd, Nicholas L., Schoen, Eve, Wolf, Florian

arXiv.org Artificial Intelligence

The Galactic Center Excess (GCE) remains one of the defining mysteries uncovered by the Fermi $\gamma$-ray Space Telescope. Although it may yet herald the discovery of annihilating dark matter, weighing against that conclusion are analyses showing the spatial structure of the emission appears more consistent with a population of dim point sources. Technical limitations have restricted prior analyses to studying the point-source hypothesis purely spatially. All spectral information that could help disentangle the GCE from the complex and uncertain astrophysical emission was discarded. We demonstrate that a neural network-aided simulation-based inference approach can overcome such limitations and thereby confront the point source explanation of the GCE with spatial and spectral data. The addition is profound: energy information drives the putative point sources to be significantly dimmer, indicating either the GCE is truly diffuse in nature or made of an exceptionally large number of sources. Quantitatively, for our best fit background model, the excess is essentially consistent with Poisson emission as predicted by dark matter. If the excess is instead due to point sources, our median prediction is ${\cal O}(10^5)$ sources in the Galactic Center, or more than 35,000 sources at 90% confidence, both significantly larger than the hundreds of sources preferred by earlier point-source analyses of the GCE.


Crowdsourced human-based computational approach for tagging peripheral blood smear sample images from Sickle Cell Disease patients using non-expert users

Rubio, José María Buades, Moyà-Alcover, Gabriel, Jaume-i-Capó, Antoni, Petrović, Nataša

arXiv.org Artificial Intelligence

Supervised machine learning methods rely on tagged training data [1]. The more tagged training data that is available, the more accurately the model can learn to recognize patterns and generalize to unseen data. Crowdsourcing and Human-Based Computation (HBC) have become an increasingly popular approach for acquiring training labels in machine learning classification tasks, as they can be a cost-effective way to share the labeling effort among a large number of annotators. This approach can be particularly useful in cases where expert labeling is expensive or not feasible, or where a large amount of labeled data is needed to train a machine learning model [2]. There exist various tactics for human users to contribute their problem-solving skills [3]. Altruistic contribution appeals to the altruistic nature of individuals willing to contribute their time and skills to solve problems for the common good [4-6]. Gamification creates engaging and fun video games that incorporate problem-solving tasks [7-9].
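A common way to turn many non-expert annotations into a single training label is majority voting with an agreement threshold. The abstract does not specify the paper's actual aggregation rule, so the function, threshold, and class names below are illustrative assumptions only.

```python
from collections import Counter

def aggregate_labels(annotations, min_agreement=0.5):
    """Majority-vote aggregation of non-expert labels (sketch).

    annotations   -- dict mapping a sample id to its list of crowd labels
    min_agreement -- the winning label must exceed this vote fraction;
                     otherwise the sample is left unlabeled (None)
    """
    results = {}
    for sample_id, labels in annotations.items():
        winner, count = Counter(labels).most_common(1)[0]
        results[sample_id] = winner if count / len(labels) > min_agreement else None
    return results

# Hypothetical votes on two blood-smear cell images:
votes = {
    "cell_01": ["sickle", "sickle", "normal"],  # 2/3 agree -> kept
    "cell_02": ["normal", "sickle"],            # tie       -> discarded
}
print(aggregate_labels(votes))  # {'cell_01': 'sickle', 'cell_02': None}
```

Discarding low-agreement samples trades dataset size for label quality, which matters when the resulting labels feed a supervised classifier.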


Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery

Abyaneh, Amin, Boroujeni, Mahrokh G., Lin, Hsiu-Chin, Ferrari-Trecate, Giancarlo

arXiv.org Machine Learning

Imitation learning is a data-driven approach to learning policies from expert behavior, but it is prone to unreliable outcomes in out-of-sample (OOS) regions. While previous research relying on stable dynamical systems guarantees convergence to a desired state, it often overlooks transient behavior. We propose a framework for learning policies modeled by contractive dynamical systems, ensuring that all policy rollouts converge regardless of perturbations, and in turn, enable efficient OOS recovery. By leveraging recurrent equilibrium networks and coupling layers, the policy structure guarantees contractivity for any parameter choice, which facilitates unconstrained optimization. Furthermore, we provide theoretical upper bounds for worst-case and expected loss terms, rigorously establishing the reliability of our method in deployment. Empirically, we demonstrate substantial OOS performance improvements in robotics manipulation and navigation tasks in simulation.
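The contractivity property the abstract relies on can be demonstrated with a toy one-dimensional policy: if the rollout map has contraction factor |a| < 1, any two trajectories converge toward each other exponentially, so a perturbed ("out-of-sample") start is recovered automatically. This is only a minimal illustration of contraction, not the paper's recurrent-equilibrium-network parameterization.

```python
def policy_step(x, a=0.7, b=1.0):
    """Toy 1-D contractive policy x_{t+1} = a*x_t + b, with |a| < 1."""
    return a * x + b

def rollout(x0, steps):
    xs = [x0]
    for _ in range(steps):
        xs.append(policy_step(xs[-1]))
    return xs

nominal   = rollout(0.0, 30)
perturbed = rollout(100.0, 30)  # heavily perturbed OOS initial state
gap = abs(nominal[-1] - perturbed[-1])
print(gap)  # initial gap of 100 shrinks by a factor 0.7 per step
```

For this linear map the gap after t steps is exactly 100 * 0.7**t (about 2e-3 at t = 30), mirroring the paper's claim that every rollout converges regardless of the perturbation.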


Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Medical Image Reconstruction

Barbano, Riccardo, Denker, Alexander, Chung, Hyungjin, Roh, Tae Hoon, Arridge, Simon, Maass, Peter, Jin, Bangti, Ye, Jong Chul

arXiv.org Artificial Intelligence

Denoising diffusion models have emerged as the go-to generative framework for solving inverse problems in imaging. A critical concern regarding these models is their performance on out-of-distribution tasks, which remains an under-explored challenge. Applied to an out-of-distribution dataset, a diffusion model can generate realistic reconstructions, but it may hallucinate image features that are uniquely present in its training dataset. To address this train-test distribution shift and improve reconstruction accuracy, we introduce a novel sampling framework called Steerable Conditional Diffusion. Specifically, this framework adapts the diffusion model, concurrently with image reconstruction, based solely on the information provided by the available measurement. Utilising our proposed method, we achieve substantial enhancements in out-of-distribution performance across diverse imaging modalities, advancing the robust deployment of denoising diffusion models in real-world applications.


Enhancement of Subjective Content Descriptions by using Human Feedback

Bender, Magnus, Braun, Tanya, Möller, Ralf, Gehrke, Marcel

arXiv.org Artificial Intelligence

An agent providing an information retrieval service may work with a corpus of text documents. The documents in the corpus may contain annotations such as Subjective Content Descriptions (SCDs) -- additional data associated with different sentences of the documents. Each SCD is associated with multiple sentences of the corpus, and the SCDs have relations among each other. The agent uses the SCDs to create its answers in response to queries supplied by users. However, the SCDs the agent uses might reflect the subjective perspective of another user. Hence, answers may be considered faulty by an agent's user, because the SCDs may not exactly match that user's perceptions. A naive and very costly approach would be to ask each user to create all the SCDs themselves from scratch. To reuse existing knowledge instead, this paper presents ReFrESH, an approach for Relation-preserving Feedback-reliant Enhancement of SCDs by Humans. An agent's user can give feedback about faulty answers to the agent. This feedback is then used by ReFrESH to update the SCDs incrementally. However, human feedback is not always unambiguous. Therefore, this paper additionally presents an approach to decide how to incorporate the feedback and when to update the SCDs. Altogether, SCDs can be updated with human feedback, allowing users to create even more specific SCDs for their needs.


How Did We Get Here? Summarizing Conversation Dynamics

Hua, Yilun, Chernogor, Nicholas, Gu, Yuzhe, Jeong, Seoyeon Julie, Luo, Miranda, Danescu-Niculescu-Mizil, Cristian

arXiv.org Artificial Intelligence

Throughout a conversation, the way participants interact with each other is in constant flux: their tones may change, they may resort to different strategies to convey their points, or they might alter their interaction patterns. An understanding of these dynamics can complement that of the actual facts and opinions discussed, offering a more holistic view of the trajectory of the conversation: how it arrived at its current state and where it is likely heading. In this work, we introduce the task of summarizing the dynamics of conversations, by constructing a dataset of human-written summaries, and exploring several automated baselines. We evaluate whether such summaries can capture the trajectory of conversations via an established downstream task: forecasting whether an ongoing conversation will eventually derail into toxic behavior. We show that they help both humans and automated systems with this forecasting task. Humans make predictions three times faster, and with greater confidence, when reading the summaries than when reading the transcripts. Furthermore, automated forecasting systems are more accurate when constructing, and then predicting based on, summaries of conversation dynamics, compared to directly predicting on the transcripts.


A deep learning framework for jointly extracting spectra and source-count distributions in astronomy

Wolf, Florian, List, Florian, Rodd, Nicholas L., Hahn, Oliver

arXiv.org Artificial Intelligence

Astronomical observations typically provide three-dimensional maps, encoding the distribution of the observed flux in (1) the two angles of the celestial sphere and (2) energy/frequency. An important task regarding such maps is to statistically characterize populations of point sources too dim to be individually detected. As the properties of a single dim source will be poorly constrained, instead one commonly studies the population as a whole, inferring a source-count distribution (SCD) that describes the number density of sources as a function of their brightness. Statistical and machine learning methods for recovering SCDs exist; however, they typically entirely neglect spectral information associated with the energy distribution of the flux. We present a deep learning framework able to jointly reconstruct the spectra of different emission components and the SCD of point-source populations. In a proof-of-concept example, we show that our method accurately extracts even complex-shaped spectra and SCDs from simulated maps.


METER: A Dynamic Concept Adaptation Framework for Online Anomaly Detection

Zhu, Jiaqi, Cai, Shaofeng, Deng, Fang, Ooi, Beng Chin, Zhang, Wenqiao

arXiv.org Artificial Intelligence

Real-time analytics and decision-making require online anomaly detection (OAD) to handle drifts in data streams efficiently and effectively. Unfortunately, existing approaches are often constrained by their limited detection capacity and slow adaptation to evolving data streams, inhibiting their efficacy and efficiency in handling concept drift, a major challenge in this setting. In this paper, we introduce METER, a novel dynamic concept adaptation framework that establishes a new paradigm for OAD. METER addresses concept drift by first training a base detection model on historical data to capture recurring central concepts, and then learning to dynamically adapt to new concepts in data streams upon detecting concept drift. Particularly, METER employs a novel dynamic concept adaptation technique that leverages a hypernetwork to dynamically generate the parameter shift of the base detection model, providing a more effective and efficient solution than conventional retraining or fine-tuning approaches. Further, METER incorporates a lightweight drift detection controller, underpinned by evidential deep learning, to support robust and interpretable concept drift detection. We conduct an extensive experimental evaluation, and the results show that METER significantly outperforms existing OAD approaches in various application scenarios.
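The hypernetwork idea above -- a second network that outputs a parameter shift for the base detector instead of retraining it -- can be sketched with plain linear maps. The base weights, the hand-set "hypernetwork", and the concept vectors below are all toy assumptions; in METER the hypernetwork is a learned neural network conditioned on the detected drift.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Base detector: a fixed linear scorer, standing in for the model
# trained on historical data.
base_w = [0.5, -0.2, 0.1]

def hypernetwork(concept_vec, scale=0.1):
    """Toy hypernetwork: maps a concept/drift embedding to a parameter
    shift (delta-weights) for the base detector."""
    return [scale * c for c in concept_vec]

def adapted_score(x, concept_vec):
    shift = hypernetwork(concept_vec)                # generated delta-params
    w = [wb + dw for wb, dw in zip(base_w, shift)]   # shifted base weights
    return dot(w, x)

x = [1.0, 2.0, 3.0]
print(adapted_score(x, [0.0, 0.0, 0.0]))  # no drift: plain base score
print(adapted_score(x, [1.0, 1.0, 1.0]))  # drift detected: shifted score
```

The base model's weights are never overwritten: each stream window gets its own generated shift, which is why this adapts faster than retraining or fine-tuning the detector itself.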