AITopics

Value Profiles for Encoding Human Variation

Sorensen, Taylor, Mishra, Pushkar, Patel, Roma, Tessler, Michael Henry, Bakker, Michiel, Evans, Georgina, Gabriel, Iason, Goodman, Noah, Rieser, Verena

Modelling human variation in rating tasks is crucial for enabling AI systems for personalization, pluralistic model alignment, and computational social science. We propose representing individuals using value profiles -- natural language descriptions of underlying values compressed from in-context demonstrations -- along with a steerable decoder model to estimate ratings conditioned on a value profile or other rater information. To measure the predictive information in rater representations, we introduce an information-theoretic methodology. We find that demonstrations contain the most information, followed by value profiles and then demographics. However, value profiles offer advantages in terms of scrutability, interpretability, and steerability due to their compressed natural language format. Value profiles effectively compress the useful information from demonstrations (>70% information preservation). Furthermore, clustering value profiles to identify similarly behaving individuals better explains rater variation than the most predictive demographic groupings. Going beyond test set performance, we show that the decoder models interpretably change ratings according to semantic profile differences, are well-calibrated, and can help explain instance-level disagreement by simulating an annotator population. These results demonstrate that value profiles offer novel, predictive ways to describe individual variation beyond demographics or group information.

large language model, machine learning, value profile, (22 more...)

2503.15484

Country:

North America > United States > Washington > King County > Seattle (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
(12 more...)

Genre: Research Report > New Finding (0.65)

Industry:

Social Sector (1.00)
Law > Statutes (1.00)
Law > Criminal Law (1.00)
(12 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Li, Haoyi, Yuan, Angela Yifei, Han, Soyeon Caren, Leckie, Christopher

SPADE: Systematic Prompt Framework for Automated Dialogue Expansion in Machine-Generated Text Detection

The increasing capability of large language models (LLMs) to generate synthetic content has heightened concerns about their misuse, driving the development of Machine-Generated Text (MGT) detection models. However, these detectors face significant challenges due to the lack of systematically generated, high-quality datasets for training. To address this issue, we propose five novel data augmentation frameworks for synthetic user dialogue generation through a structured prompting approach, reducing the costs associated with traditional data collection methods. Our proposed method yields 14 new dialogue datasets, which we benchmark against seven MGT detection models. The results demonstrate improved generalization performance when utilizing a mixed dataset produced by our proposed augmentation framework. Furthermore, considering that real-world agents lack knowledge of future opponent utterances, we simulate online dialogue detection and examine the relationship between chat history length and detection accuracy. We also benchmark online detection performance with limited chat history on our frameworks. Our open-source datasets can be downloaded from https://github.com/AngieYYF/SPADE-customer-service-dialogue.

large language model, machine learning, natural language, (22 more...)

2503.15044

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > District of Columbia > Washington (0.05)
North America > United States > New York > New York County > New York City (0.04)
(13 more...)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Kuhn, Korbinian, Kersken, Verena, Zimmermann, Gottfried

Evaluating ASR Confidence Scores for Automated Error Detection in User-Assisted Correction Interfaces

Despite advances in Automatic Speech Recognition (ASR), transcription errors persist and require manual correction. Confidence scores, which indicate the certainty of ASR results, could assist users in identifying and correcting errors. This study evaluates the reliability of confidence scores for error detection through a comprehensive analysis of end-to-end ASR models and a user study with 36 participants. The results show that while confidence scores correlate with transcription accuracy, their error detection performance is limited. Classifiers frequently miss errors or generate many false positives, undermining their practical utility. Confidence-based error detection neither improved correction efficiency nor was perceived as helpful by participants. These findings highlight the limitations of confidence scores and the need for more sophisticated approaches to improve user interaction and explainability of ASR results.

confidence score, machine learning, natural language, (16 more...)

doi: 10.1145/3706599.3720038

2503.15124

Country:

North America > United States > Maryland > Baltimore (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
(32 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)

Bigata, Antoni, Stypułkowski, Michał, Mira, Rodrigo, Bounareli, Stella, Vougioukas, Konstantinos, Landgraf, Zoe, Drobyshev, Nikita, Zieba, Maciej, Petridis, Stavros, Pantic, Maja

KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation

Current audio-driven facial animation methods achieve impressive results for short videos but suffer from error accumulation and identity drift when extended to longer durations. Existing methods attempt to mitigate this through external spatial control, increasing long-term consistency but compromising the naturalness of motion. We propose KeyFace, a novel two-stage diffusion-based framework, to address these issues. In the first stage, keyframes are generated at a low frame rate, conditioned on audio input and an identity frame, to capture essential facial expressions and movements over extended periods of time. In the second stage, an interpolation model fills in the gaps between keyframes, ensuring smooth transitions and temporal coherence. To further enhance realism, we incorporate continuous emotion representations and handle a wide range of non-speech vocalizations (NSVs), such as laughter and sighs. We also introduce two new evaluation metrics for assessing lip synchronization and NSV generation. Experimental results show that KeyFace outperforms state-of-the-art methods in generating natural, coherent facial animations over extended durations, successfully encompassing NSVs and continuous emotions.

artificial intelligence, machine learning, natural language, (19 more...)

2503.01715

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
(13 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Graphics > Animation (0.94)
(2 more...)

Challenges and Trends in Egocentric Vision: A Survey

Li, Xiang, Qiu, Heqian, Wang, Lanxiao, Zhang, Hanwen, Qi, Chenghao, Han, Linfeng, Xiong, Huiyu, Li, Hongliang

With the rapid development of artificial intelligence technologies and wearable devices, egocentric vision understanding has emerged as a new and challenging research direction, gradually attracting widespread attention from both academia and industry. Egocentric vision captures visual and multimodal data through cameras or sensors worn on the human body, offering a unique perspective that simulates human visual experiences. This paper provides a comprehensive survey of the research on egocentric vision understanding, systematically analyzing the components of egocentric scenes and categorizing the tasks into four main areas: subject understanding, object understanding, environment understanding, and hybrid understanding. We explore in detail the sub-tasks within each category. We also summarize the main challenges and trends currently existing in the field. Furthermore, this paper presents an overview of high-quality egocentric vision datasets, offering valuable resources for future research. By summarizing the latest advancements, we anticipate the broad applications of egocentric vision technologies in fields such as augmented reality, virtual reality, and embodied intelligence, and propose future research directions based on the latest developments in the field.

data mining, large language model, machine learning, (23 more...)

2503.15275

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)
(13 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.45)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Leisure & Entertainment (1.00)
Information Technology (1.00)
Health & Medicine > Therapeutic Area (0.67)
(3 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Hardware (1.00)
(10 more...)

Qin, Zhenlin, Wang, Leizhen, Pereira, Francisco Camara, Ma, Zhenlinag

A Foundational individual Mobility Prediction Model based on Open-Source Large Language Models

Large Language Models (LLMs) are widely applied to domain-specific tasks due to their massive general knowledge and remarkable inference capacities. Current studies on LLMs have shown immense potential in applying LLMs to model individual mobility prediction problems. However, most LLM-based mobility prediction models only train on specific datasets or use single well-designed prompts, leading to difficulty in adapting to different cities and users with diverse contexts. To fill these gaps, this paper proposes a unified fine-tuning framework to train a foundational open source LLM-based mobility prediction model. We conducted extensive experiments on six real-world mobility datasets to validate the proposed model. The results showed that the proposed model achieved the best performance in prediction accuracy and transferability over state-of-the-art models based on deep learning and LLMs.

large language model, machine learning, natural language, (17 more...)

2503.16553

Country:

Asia > China > Zhejiang Province > Hangzhou (0.05)
Asia > China > Hong Kong (0.05)
Europe > Denmark (0.04)
(6 more...)

Genre: Research Report > New Finding (0.86)

Industry: Transportation (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neuronal Activation States as Sample Embeddings for Data Selection in Task-Specific Instruction Tuning

Ma, Da, Shang, Gonghu, Chen, Zhi, Qin, Libo, Luo, Yijie, Pan, Lei, Fan, Shuai, Chen, Lu, Yu, Kai

Task-specific instruction tuning enhances the performance of large language models (LLMs) on specialized tasks, yet efficiently selecting relevant data for this purpose remains a challenge. Inspired by neural coactivation in the human brain, we propose a novel data selection method called NAS, which leverages neuronal activation states as embeddings for samples in the feature space. Extensive experiments show that NAS outperforms classical data selection methods in terms of both effectiveness and robustness across different models, datasets, and selection ratios.

large language model, machine learning, natural language, (14 more...)

2503.15573

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(2 more...)

Genre:

Workflow (0.71)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ACE: A Cardinality Estimator for Set-Valued Queries

Sheng, Yufan, Cao, Xin, Zhao, Kaiqi, Fang, Yixiang, Qi, Jianzhong, Zhang, Wenjie, Jensen, Christian S.

Cardinality estimation is a fundamental functionality in database systems. Most existing cardinality estimators focus on handling predicates over numeric or categorical data. They have largely omitted an important data type, set-valued data, which frequently occur in contemporary applications such as information retrieval and recommender systems. The few existing estimators for such data either favor high-frequency elements or rely on a partial independence assumption, which limits their practical applicability. We propose ACE, an Attention-based Cardinality Estimator for estimating the cardinality of queries over set-valued data. We first design a distillation-based data encoder to condense the dataset into a compact matrix. We then design an attention-based query analyzer to capture correlations among query elements. To handle variable-sized queries, a pooling module is introduced, followed by a regression model (MLP) to generate final cardinality estimates. We evaluate ACE on three datasets with varying query element distributions, demonstrating that ACE outperforms the state-of-the-art competitors in terms of both accuracy and efficiency.

artificial intelligence, machine learning, natural language, (21 more...)

2503.14929

Country:

South America > Argentina (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(9 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
(2 more...)

Fast MLE and MAPE-Based Device Activity Detection for Grant-Free Access via PSCA and PSCA-Net

Tan, Bowen, Cui, Ying

Fast and accurate device activity detection is the critical challenge in grant-free access for supporting massive machine-type communications (mMTC) and ultra-reliable low-latency communications (URLLC) in 5G and beyond. The state-of-the-art methods have unsatisfactory error rates or computation times. To address these outstanding issues, we propose new maximum likelihood estimation (MLE) and maximum a posterior estimation (MAPE) based device activity detection methods for known and unknown pathloss that achieve superior error rate and computation time tradeoffs using optimization and deep learning techniques. Specifically, we investigate four non-convex optimization problems for MLE and MAPE in the two pathloss cases, with one MAPE problem being formulated for the first time. For each non-convex problem, we develop an innovative parallel iterative algorithm using the parallel successive convex approximation (PSCA) method. Each PSCA-based algorithm allows parallel computations, uses up to the objective function's second-order information, converges to the problem's stationary points, and has a low per-iteration computational complexity compared to the state-of-the-art algorithms. Then, for each PSCA-based iterative algorithm, we present a deep unrolling neural network implementation, called PSCA-Net, to further reduce the computation time. Each PSCA-Net elegantly marries the underlying PSCA-based algorithm's parallel computation mechanism with the parallelizable neural network architecture and effectively optimizes its step sizes based on vast data samples to speed up the convergence. Numerical results demonstrate that the proposed methods can significantly reduce the error rate and computation time compared to the state-of-the-art methods, revealing their significant values for grant-free access.

artificial intelligence, bayesian inference, machine learning, (18 more...)

doi: 10.1109/TWC.2025.3545649

2503.15259

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
Oceania > Australia (0.04)
(4 more...)

Genre:

Research Report > Promising Solution (0.54)
Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)