Goto

Collaborating Authors

 Education


Learning from Interval Targets

arXiv.org Artificial Intelligence

We study the problem of regression with interval targets, where only upper and lower bounds on target values are available in the form of intervals. This problem arises when the exact target label is expensive or impossible to obtain, due to inherent uncertainties. In the absence of exact targets, traditional regression loss functions cannot be used. First, we study the methodology of using a loss functions compatible with interval targets, for which we establish non-asymptotic generalization bounds based on smoothness of the hypothesis class that significantly relaxing prior assumptions of realizability and small ambiguity degree. Second, we propose a novel min-max learning formulation: minimize against the worst-case (maximized) target labels within the provided intervals. The maximization problem in the latter is non-convex, but we show that good performance can be achieved with the incorporation of smoothness constraints. Finally, we perform extensive experiments on real-world datasets and show that our methods achieve state-of-the-art performance.


Can large audio language models understand child stuttering speech? speech summarization, and source separation

arXiv.org Artificial Intelligence

Child speech differs from adult speech in acoustics, prosody, and language development, and disfluencies (repetitions, prolongations, blocks) further challenge Automatic Speech Recognition (ASR) and downstream Natural Language Processing (NLP). Recent large audio-language models (LALMs) demonstrate strong cross-modal audio understanding; however, their behavior in disfluent child speech remains underexplored. We evaluate several state-of-the-art LALMs in two settings: an interview (mixed speakers) and a reading task (single child). The tasks are (i) single-channel source separation to isolate the child and (ii) child-only summarization that preserves clinically relevant disfluencies and avoids adult-speech leakage. Evaluation combines Large Language Model (LLM) as a judge, human expert ratings, and BERTScore (F1), and we report agreement between models and between models and humans to assess reliability. Our findings delineate the conditions under which LALMs produce faithful child-only summaries from mixed audio and where they fail, offering practical guidance for clinical and educational deployments. We provide prompts and evaluation scripts to support replication.


Knowledge Distillation of Uncertainty using Deep Latent Factor Model

arXiv.org Artificial Intelligence

Deep ensembles deliver state-of-the-art, reliable uncertainty quantification, but their heavy computational and memory requirements hinder their practical deployments to real applications such as on-device AI. Knowledge distillation compresses an ensemble into small student models, but existing techniques struggle to preserve uncertainty partly because reducing the size of DNNs typically results in variation reduction. To resolve this limitation, we introduce a new method of distribution distillation (i.e. compressing a teacher ensemble into a student distribution instead of a student ensemble) called Gaussian distillation, which estimates the distribution of a teacher ensemble through a special Gaussian process called the deep latent factor model (DLF) by treating each member of the teacher ensemble as a realization of a certain stochastic process. The mean and covariance functions in the DLF model are estimated stably by using the expectation-maximization (EM) algorithm. By using multiple benchmark datasets, we demonstrate that the proposed Gaussian distillation outperforms existing baselines. In addition, we illustrate that Gaussian distillation works well for fine-tuning of language models and distribution shift problems.


On the Universal Near Optimality of Hedge in Combinatorial Settings

arXiv.org Artificial Intelligence

In this paper, we study the classical Hedge algorithm in combinatorial settings. In each round, the learner selects a vector $\boldsymbol{x}_t$ from a set $X \subseteq \{0,1\}^d$, observes a full loss vector $\boldsymbol{y}_t \in \mathbb{R}^d$, and incurs a loss $\langle \boldsymbol{x}_t, \boldsymbol{y}_t \rangle \in [-1,1]$. This setting captures several important problems, including extensive-form games, resource allocation, $m$-sets, online multitask learning, and shortest-path problems on directed acyclic graphs (DAGs). It is well known that Hedge achieves a regret of $O\big(\sqrt{T \log |X|}\big)$ after $T$ rounds of interaction. In this paper, we ask whether Hedge is optimal across all combinatorial settings. To that end, we show that for any $X \subseteq \{0,1\}^d$, Hedge is near-optimal--specifically, up to a $\sqrt{\log d}$ factor--by establishing a lower bound of $ฮฉ\big(\sqrt{T \log(|X|)/\log d}\big)$ that holds for any algorithm. We then identify a natural class of combinatorial sets--namely, $m$-sets with $\log d \leq m \leq \sqrt{d}$--for which this lower bound is tight, and for which Hedge is provably suboptimal by a factor of exactly $\sqrt{\log d}$. At the same time, we show that Hedge is optimal for online multitask learning, a generalization of the classical $K$-experts problem. Finally, we leverage the near-optimality of Hedge to establish the existence of a near-optimal regularizer for online shortest-path problems in DAGs--a setting that subsumes a broad range of combinatorial domains. Specifically, we show that the classical Online Mirror Descent (OMD) algorithm, when instantiated with the dilated entropy regularizer, is iterate-equivalent to Hedge, and therefore inherits its near-optimal regret guarantees for DAGs.


Preference-driven Knowledge Distillation for Few-shot Node Classification

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) can efficiently process text-attributed graphs (TAGs) due to their message-passing mechanisms, but their training heavily relies on the human-annotated labels. Moreover, the complex and diverse local topologies of nodes of real-world TAGs make it challenging for a single mechanism to handle. Large language models (LLMs) perform well in zero-/few-shot learning on TAGs but suffer from a scalability challenge. Therefore, we propose a preference-driven knowledge distillation (PKD) framework to synergize the complementary strengths of LLMs and various GNNs for few-shot node classification. Specifically, we develop a GNN-preference-driven node selector that effectively promotes prediction distillation from LLMs to teacher GNNs. To further tackle nodes' intricate local topologies, we develop a node-preference-driven GNN selector that identifies the most suitable teacher GNN for each node, thereby facilitating tailored knowledge distillation from teacher GNNs to the student GNN. Extensive experiments validate the efficacy of our proposed framework in few-shot node classification on real-world TAGs. Our code is be available.


Beyond Accuracy: Rethinking Hallucination and Regulatory Response in Generative AI

arXiv.org Artificial Intelligence

Hallucination in generative AI is often treated as a technical failure to produce factually correct output. Yet this framing underrepresents the broader significance of hallucinated content in language models, which may appear fluent, persuasive, and contextually appropriate while conveying distortions that escape conventional accuracy checks. This paper critically examines how regulatory and evaluation frameworks have inherited a narrow view of hallucination, one that prioritises surface verifiability over deeper questions of meaning, influence, and impact. We propose a layered approach to understanding hallucination risks, encompassing epistemic instability, user misdirection, and social-scale effects. Drawing on interdisciplinary sources and examining instruments such as the EU AI Act and the GDPR, we show that current governance models struggle to address hallucination when it manifests as ambiguity, bias reinforcement, or normative convergence. Rather than improving factual precision alone, we argue for regulatory responses that account for languages generative nature, the asymmetries between system and user, and the shifting boundaries between information, persuasion, and harm.


Marcel: A Lightweight and Open-Source Conversational Agent for University Student Support

arXiv.org Artificial Intelligence

We present Marcel, a lightweight and open-source conversational agent designed to support prospective students with admission-related inquiries. The system aims to provide fast and personalized responses, while reducing workload of university staff. We employ retrieval-augmented generation to ground answers in university resources and to provide users with verifiable, contextually relevant information. We introduce a Frequently Asked Question (FAQ) retriever that maps user questions to knowledge-base entries, which allows administrators to steer retrieval, and improves over standard dense/hybrid retrieval strategies. The system is engineered for easy deployment in resource-constrained academic settings. We detail the system architecture, provide a technical evaluation of its components, and report insights from a real-world deployment.


Online Learning for Dynamic Vickrey-Clarke-Groves Mechanism in Unknown Environments

arXiv.org Artificial Intelligence

We consider the problem of online dynamic mechanism design for sequential auctions in unknown environments, where the underlying market and, thus, the bidders' values vary over time as interactions between the seller and the bidders progress. We model the sequential auctions as an infinite-horizon average-reward Markov decision process (MDP). In each round, the seller determines an allocation and sets a payment for each bidder, while each bidder receives a private reward and submits a sealed bid to the seller. The state, which represents the underlying market, evolves according to an unknown transition kernel and the seller's allocation policy without episodic resets. We first extend the Vickrey-Clarke-Groves (VCG) mechanism to sequential auctions, thereby obtaining a dynamic counterpart that preserves the desired properties: efficiency, truthfulness, and individual rationality. We then focus on the online setting and develop a reinforcement learning algorithm for the seller to learn the underlying MDP and implement a mechanism that closely resembles the dynamic VCG mechanism. We show that the learned mechanism approximately satisfies efficiency, truthfulness, and individual rationality and achieves guaranteed performance in terms of various notions of regret.


Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated significant improvements in contextual understanding. However, their ability to attend to truly critical information during long-context reasoning and generation still falls behind the pace. Specifically, our preliminary experiments reveal that certain distracting patterns can misdirect the model's attention during inference, and removing these patterns substantially improves reasoning accuracy and generation quality. We attribute this phenomenon to spurious correlations in the training data, which obstruct the model's capacity to infer authentic causal instruction-response relationships. This phenomenon may induce redundant reasoning processes, potentially resulting in significant inference overhead and, more critically, the generation of erroneous or suboptimal responses. To mitigate this, we introduce a two-stage framework called Learning to Focus (LeaF) leveraging intervention-based inference to disentangle confounding factors. In the first stage, LeaF employs gradient-based comparisons with an advanced teacher to automatically identify confounding tokens based on causal relationships in the training corpus. Then, in the second stage, it prunes these tokens during distillation to enact intervention, aligning the student's attention with the teacher's focus distribution on truly critical context tokens. Experimental results demonstrate that LeaF not only achieves an absolute improvement in various mathematical reasoning, code generation and multi-hop question answering benchmarks but also effectively suppresses attention to confounding tokens during inference, yielding a more interpretable and reliable reasoning model.


A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search

arXiv.org Artificial Intelligence

The fundamental limitation of the behavioral cloning (BC) approach to imitation learning is that it only teaches an agent what the expert did at states the expert visited. This means that when a BC agent makes a mistake which takes them out of the support of the demonstrations, they often don't know how to recover from it. In this sense, BC is akin to giving the agent the fish -- giving them dense supervision across a narrow set of states -- rather than teaching them to fish: to be able to reason independently about achieving the expert's outcome even when faced with unseen situations at test-time. In response, we explore learning to search (L2S) from expert demonstrations, i.e. learning the components required to, at test time, plan to match expert outcomes, even after making a mistake. These include (1) a world model and (2) a reward model. We carefully ablate the set of algorithmic and design decisions required to combine these and other components for stable and sample/interaction-efficient learning of recovery behavior without additional human corrections. Across a dozen visual manipulation tasks from three benchmarks, our approach SAILOR consistently out-performs state-of-the-art Diffusion Policies trained via BC on the same data. Furthermore, scaling up the amount of demonstrations used for BC by 5-10x still leaves a performance gap. We find that SAILOR can identify nuanced failures and is robust to reward hacking. Our code is available at https://github.com/arnavkj1995/SAILOR .