Goto

Collaborating Authors

 Education


IDALC: A Semi-Supervised Framework for Intent Detection and Active Learning based Correction

arXiv.org Artificial Intelligence

Voice-controlled dialog systems have become immensely popular due to their ability to perform a wide range of actions in response to diverse user queries. These agents possess a predefined set of skills or intents to fulfill specific user tasks. But every system has its own limitations. There are instances where, even for known intents, if any model exhibits low confidence, it results in rejection of utterances that necessitate manual annotation. Additionally, as time progresses, there may be a need to retrain these agents with new intents from the system-rejected queries to carry out additional tasks. Labeling all these emerging intents and rejected utterances over time is impractical, thus calling for an efficient mechanism to reduce annotation costs. In this paper, we introduce IDALC (Intent Detection and Active Learning based Correction), a semi-supervised framework designed to detect user intents and rectify system-rejected utterances while minimizing the need for human annotation. Empirical findings on various benchmark datasets demonstrate that our system surpasses baseline methods, achieving a 5-10% higher accuracy and a 4-8% improvement in macro-F1. Remarkably, we maintain the overall annotation cost at just 6-10% of the unlabelled data available to the system. The overall framework of IDALC is shown in Fig. 1


Adaptation and Fine-tuning with TabPFN for Travelling Salesman Problem

arXiv.org Artificial Intelligence

Tabular Prior-Data Fitted Network (TabPFN) is a foundation model designed for small to medium-sized tabular data, which has attracted much attention recently. This paper investigates the application of TabPFN in Combinatorial Optimization (CO) problems. The aim is to lessen challenges in time and data-intensive training requirements often observed in using traditional methods including exact and heuristic algorithms, Machine Learning (ML)-based models, to solve CO problems. Proposing possibly the first ever application of TabPFN for such a purpose, we adapt and fine-tune the TabPFN model to solve the Travelling Salesman Problem (TSP), one of the most well-known CO problems. Specifically, we adopt the node-based approach and the node-predicting adaptation strategy to construct the entire TSP route. Our evaluation with varying instance sizes confirms that TabPFN requires minimal training, adapts to TSP using a single sample, performs better generalization across varying TSP instance sizes, and reduces performance degradation. Furthermore, the training process with adaptation and fine-tuning is completed within minutes. The methodology leads to strong solution quality even without post-processing and achieves performance comparable to other models with post-processing refinement. Our findings suggest that the TabPFN model is a promising approach to solve structured and CO problems efficiently under training resource constraints and rapid deployment requirements.


Can a Small Model Learn to Look Before It Leaps? Dynamic Learning and Proactive Correction for Hallucination Detection

arXiv.org Artificial Intelligence

Hallucination in large language models (LLMs) remains a critical barrier to their safe deployment. Existing tool-augmented hallucination detection methods require pre-defined fixed verification strategies, which are crucial to the quality and effectiveness of tool calls. Some methods directly employ powerful closed-source LLMs such as GPT-4 as detectors, which are effective but too costly. To mitigate the cost issue, some methods adopt the teacher-student architecture and finetune open-source small models as detectors via agent tuning. However, these methods are limited by fixed strategies. When faced with a dynamically changing execution environment, they may lack adaptability and inappropriately call tools, ultimately leading to detection failure. To address the problem of insufficient strategy adaptability, we propose the innovative ``Learning to Evaluate and Adaptively Plan''(LEAP) framework, which endows an efficient student model with the dynamic learning and proactive correction capabilities of the teacher model. Specifically, our method formulates the hallucination detection problem as a dynamic strategy learning problem. We first employ a teacher model to generate trajectories within the dynamic learning loop and dynamically adjust the strategy based on execution failures. We then distill this dynamic planning capability into an efficient student model via agent tuning. Finally, during strategy execution, the student model adopts a proactive correction mechanism, enabling it to propose, review, and optimize its own verification strategies before execution. We demonstrate through experiments on three challenging benchmarks that our LEAP-tuned model outperforms existing state-of-the-art methods.


Sign language recognition from skeletal data using graph and recurrent neural networks

arXiv.org Artificial Intelligence

Within this field, Isolated Sign Language Recognition (ISLR) focuses on recognizing single, context-free signs, which can be seen as the building blocks for continuous sign language understanding. While ISLR provides an environment to study the visual and temporal characteristics of signs, it also poses significant challenges due to intra-class variations, such as differences in signing speed, hand orientation, and signer-dependent styles. Addressing these challenges is essential for building reliable models that can generalize across diverse signers and recording conditions. Moreover, the recognition of isolated signs provides a basis for Continuous Sign Language Recognition (CSLR), where the boundaries between consecutive signs are not explicitly defined. ISLR is typically formulated as a classification problem, where each video segment corresponds to a single label representing a sign.


Lived Experience in Dialogue: Co-designing Personalization in Large Language Models to Support Youth Mental Well-being

arXiv.org Artificial Intelligence

We conducted three 90 - minute workshops at Talenthub Op Zuid, each with a different group of participants (total N=24, MAge =17.6, SD=1.2, see S upplement for additional details). In the first workshop, participants reviewed the prior 13 personas from Stage 1 and critiqued them for gaps in relevance. The scoping personas generated from survey and forum data gave youth stakeholders a concrete starting point for consulting as experts by experience in initial co - design activities. They challenged the realism of the scoping personas . Using fill - in - the - blank templates to guide but not restrict their persona creation (created by a youth member of the research team with design training, see Supplement), youth added contextual details to the project personas, such as daily routines, stressors, and digital habits, and brainstormed plausible backstories involving bullying, school difficulties, or parental conflict. The second workshop engaged a new participant group who expanded on previous outputs and addressed additional questions on living environment and emotional support needs, as this was suggested as relevant by youth from the prior workshop . Participants revised or created new personas b ased on their own or peers' experiences. In t he third workshop, a new group of participants again reviewed prior co - creation and outputs and further refined the personas .


Long Grounded Thoughts: Distilling Compositional Visual Reasoning Chains at Scale

arXiv.org Artificial Intelligence

Recent progress in multimodal reasoning has been driven largely by undisclosed datasets and proprietary data synthesis recipes, leaving open questions about how to systematically build large-scale, vision-centric reasoning datasets, particularly for tasks that go beyond visual math. In this work, we introduce a new reasoning data generation framework spanning diverse skills and levels of complexity with over 1M high-quality synthetic vision-centric questions. The dataset also includes preference data and instruction prompts supporting both offline and online RL. Our synthesis framework proceeds in two stages: (1) scale; and (2) complexity. Reasoning traces are then synthesized through a two-stage process that leverages VLMs and reasoning LLMs, producing CoT traces for VLMs that capture the richness and diverse cognitive behaviors found in frontier reasoning models. Remarkably, we show that finetuning Qwen2.5-VL-7B on our data outperforms all open-data baselines across all evaluated vision-centric benchmarks, and even surpasses strong closed-data models such as MiMo-VL-7B-RL on V* Bench, CV-Bench and MMStar-V. Perhaps most surprising, despite being entirely vision-centric, our data transfers positively to text-only reasoning (MMLU-Pro) and audio reasoning (MMAU), demonstrating its effectiveness. Similarly, despite not containing videos or embodied visual data, we observe notable gains when evaluating on a single-evidence embodied QA benchmark (NiEH). Finally, we use our data to analyze the entire VLM post-training pipeline. Our empirical analysis highlights that (i) SFT on high-quality data with non-linear reasoning traces is essential for effective online RL, (ii) staged offline RL matches online RL's performance while reducing compute demands, and (iii) careful SFT on high quality data can substantially improve out-of-domain, cross-modality transfer.


Elements of Active Continuous Learning and Uncertainty Self-Awareness: a Narrow Implementation for Face and Facial Expression Recognition

arXiv.org Artificial Intelligence

Reflection on one's thought process and making corrections to it if there exists dissatisfaction in its performance is, perhaps, one of the essential traits of intelligence. However, such high-level abstract concepts mandatory for Artificial General Intelligence can be modelled even at the low level of narrow Machine Learning algorithms. Here, we present the self-awareness mechanism emulation in the form of a supervising artificial neural network (ANN) observing patterns in activations of another underlying ANN in a search for indications of the high uncertainty of the underlying ANN and, therefore, the trustworthiness of its predictions. The underlying ANN is a convolutional neural network (CNN) ensemble employed for face recognition and facial expression tasks. The self-awareness ANN has a memory region where its past performance information is stored, and its learnable parameters are adjusted during the training to optimize the performance. The trustworthiness verdict triggers the active learning mode, giving elements of agency to the machine learning algorithm that asks for human help in high uncertainty and confusion conditions.


WATCHDOG: How universities are rebranding DEI to skirt Trump's crackdown

FOX News

This material may not be published, broadcast, rewritten, or redistributed. Quotes displayed in real-time or delayed by at least 15 minutes. Market data provided by Factset . Powered and implemented by FactSet Digital Solutions . Mutual Fund and ETF data provided by LSEG .


Lice Checks, Crafts, and Being Touched by Strangers: Inside a Role-Playing ASMR Spa

WIRED

Tinglesbar incorporates ASMR into elementary school and doctor's visits simulations, offering a social haven for introverts. "It's time for your lice check," a woman who goes by "Ms. K" whispers directly into my ear as she starts running her fingers into my scalp and through each strand of hair. I'm in a dark room, Eastern flute music playing in the background as I sit across from my partner who's also having his hair caressed by a stranger. We close our eyes so we don't burst out laughing.


Training and Testing with Multiple Splits: A Central Limit Theorem for Split-Sample Estimators

arXiv.org Machine Learning

As predictive algorithms grow in popularity, using the same dataset to both train and test a new model has become routine across research, policy, and industry. Sample-splitting attains valid inference on model properties by using separate subsamples to estimate the model and to evaluate it. However, this approach has two drawbacks, since each task uses only part of the data, and different splits can lead to widely different estimates. Averaging across multiple splits, I develop an inference approach that uses more data for training, uses the entire sample for testing, and improves reproducibility. I address the statistical dependence from reusing observations across splits by proving a new central limit theorem for a large class of split-sample estimators under arguably mild and general conditions. Importantly, I make no restrictions on model complexity or convergence rates. I show that confidence intervals based on the normal approximation are valid for many applications, but may undercover in important cases of interest, such as comparing the performance between two models. I develop a new inference approach for such cases, explicitly accounting for the dependence across splits. Moreover, I provide a measure of reproducibility for p-values obtained from split-sample estimators. Finally, I apply my results to two important problems in development and public economics: predicting poverty and learning heterogeneous treatment effects in randomized experiments. I show that my inference approach with repeated cross-fitting achieves better power than previous alternatives, often enough to find statistical significance that would otherwise be missed.